23 August 2012

What is reward?

So I recently passed my PhD viva and got a paper published (whoop whoop!). The titles of the two texts are ‘Multi-electrode analysis of pattern generation and its adaptation to reward’ and ‘Multi-neuronal refractory period adapts centrally generated behaviour to reward’. That last word, and my use of it in the texts, caused a fair amount of trouble. I naïvely thought it would be OK to leave the term undefined, seeing as we’re still working out how the brain’s reward system operates. For example, the observation that midbrain dopamine neurons are sometimes activated by stimuli purely by virtue of those stimuli being new and unexpected (rather than appetizing, sexual, etc.) suggests that novelty itself might be thought of as a reward. And anyway we all have a fairly good intuitive understanding of what constitutes a reward, right? Wrong. I need to define reward.

Here’s what we ended up writing in the paper following an extended skirmish with reviewers:
"We will refer to… a ‘reward’ in the general meaning of a stimulus that promotes approach and consummatory behaviour rather than the more specific meaning of an unconditioned stimulus used as a positive reinforcer in a classical or operant long-term conditioning paradigm." (Harris et al., 2012)
I'd like to contrast this definition with that of Wolfram Schultz, who writes:
"A reward is any object or event that generates approach behavior and consumption, produces learning of such behavior, and is an outcome of decision making." (Schultz, 2007)
Schultz's second condition, that rewards produce learning of approach behaviour and consumption, raises the question: does this refer to conditioning proper, in which memory persists long after the reward is removed, or does an effect on short-term memory suffice? For example, is a food object rewarding merely by virtue of inducing a high and sustained feeding rate, or must it also increase the probability that similar food objects will be eaten in the future? This question has physiological consequences: both classical and operant conditioning require brief bursts of spikes in midbrain dopamine neurons (Tsai et al., 2009; Kim et al., 2012), whereas the rate and intensity of ongoing behaviour, and the stability of working memory representations, are regulated by the tonic concentration of dopamine, which is set by the number of dopamine neurons engaged in slow pacemaker firing at any given moment (Niv et al., 2007; Cools & Robbins, 2004). In fact, Schultz's definition of reward does require persistent memory formation, i.e. bursts of dopamine. I disagree. I think a stimulus-induced increase in the rate and intensity of approach and consummatory behaviour can be thought of as a reward-response regardless of whether it produces lasting behavioural change. Yael Niv, for example, has argued convincingly that the average rate of reward over time modulates tonic background concentrations of dopamine, and thereby adapts the rate and intensity of foraging behaviour (Niv et al., 2007). There are many indications that this also extends to non-food rewards. This view is also in accordance with Norman White's, who writes that rewards are stimuli that elicit approach behaviour whereas reinforcers induce memory consolidation (White, 1989). Roy Wise similarly notes that 'priming' is an important effect of rewards, but one which does not find its way into long-term memory (Wise, 2009).

Schultz's third condition, that rewards be the outcome of decision making, is also problematic. If this condition is taken to mean that a reward must be the consequence of an overt motor behaviour, as many people would argue, then two objections follow. First, cases of classical conditioning where a neutral stimulus is paired with, for example, food, producing a subsequent preference for the neutral stimulus, do not involve any overt motor behaviour or action and so cannot according to this definition be said to involve reward. This is in stark contrast to numerous papers that describe such experiments as 'classical reward conditioning' and the food stimuli used as rewards. Second, say you give a hungry rat a food pellet, either at a randomly chosen time or as a consequence of the rat wandering into a pre-defined part of the cage. Do we really want to say that the pellet is a reward in the latter case but not in the former? Physiologically there will be no difference: the dopamine burst response and its effect on synaptic plasticity will be the same. Isn't it in fact the case that brains are always in the process of deciding how to act, and operate by responding to correlations between their own activity states (be they sensory- or motor-states) and varying concentrations of dopamine? Whether or not a reward is in fact the causal outcome of a decision is irrelevant from the perspective of the brain.

In light of all this, I would suggest the following new definition of reward:
A reward is an object or event that induces approach and consummatory behaviour, and produces short- or long-term learning of that behaviour.
The lack of reference to rewards necessarily being the outcome of overt decision making constitutes a deviation from the way the term reward is used in everyday language (for example, an unexpected tax refund is a reward according to this new definition), but not, I think, from the way many scientists use the term. One might argue that such stimuli should be referred to as 'non-contingent rewards', but, at least in the case of the term 'reinforcement', this approach appears only to have complicated matters (Poling & Normand, 1999). Maybe, then, we should drop the term reward entirely, and use 'positive stimulus' instead? However, this term has the serious disadvantage of not being a verbal noun. That is, whereas everyone understands the noun 'reward' and the associated verb 'rewarding', there is no established understanding of the (compound) verb 'positively stimulating' that is associated with the (compound) noun 'positive stimulus'. If anything, 'positive' has optimistic or ethical connotations that would jar with the amoral and downright destructive topics often discussed in relation to reward, such as addiction. The term 'appetitive stimulus' (and 'appetitively stimulating') avoids this problem but implies a focus on satisfying bodily needs, particularly hunger, whereas the key property of reward is that it can apply to any desire or goal.

Have I missed something: some word with the same meaning as 'reward' but better able to match the physiology? Is it time to make up a new word? If not, then I would suggest we stick with reward, using the definition above, accepting it as a slight neologism. The lack of a requirement that a reward necessarily be a consequence of overt decision making or motor behaviour should be appropriately tempered by the understanding that in fact the vast majority of rewards do occur as a consequence of decision making and motor behaviour - specifically as the result of exploration, trial-and-error, or more complex goal-oriented behaviours.

And here we see the fuck yeah monkey upon his mountain of treats!
(The treats are all rewards provided the monkey has an appetite)
(h/t Austen)