A crucial element for the survival of animals and humans is learning how to acquire rewarding stimuli such as food, sex, and social rewards. While learning is a powerful skill, nothing in the world remains the same for long, and learning must be adaptive to allow an animal to survive flexibly in a changing environment. Dopamine has long been known for its critical role in cue-reward associations, and new data provide a much richer and more complex picture of how dopaminergic neurons function.
Making decisions about impending actions requires an understanding of the expected value of outcomes, relative costs incurred between choices, and the probabilities of achieving the possible outcomes. The neural activity of midbrain dopamine neurons is thought to play a crucial role by informing the decision process with the values of known expected outcomes, and after decisions, by informing the animal whether the outcome was better, worse, or as expected. Midbrain dopamine neurons are, therefore, critical for invigorating behaviors aimed at acquiring large expected rewards and for adjusting behaviors when the outcomes of decisions are revealed.
In the late 1990s, scientists were looking for teaching signals in the brain: mechanisms by which learning may take place. One such teaching signal, proposed in the 1970s by Rescorla and Wagner [1], was based on the idea that a fully expected outcome teaches nothing new (for example, you walk into a dark room, flick the light switch, and a light comes on). For learning to take place, something unexpected must occur. Imagine you have never seen a light switch, and you walk into a dark room wanting to turn the lights on. After some fumbling about, you accidentally flick the right switch and, to your surprise, the lights turn on. This outcome is positive and also unexpected. According to Rescorla and Wagner, the difference between the expected outcome (no change, lights stay off) and the actual outcome (big change, lights turn on) should induce a teaching signal. Further, because the difference between the expected and actual outcome was in the desired direction, the outcome was appetitive, and this signal is known as a ‘positive prediction error.’
Over time, you learn which switch turns the lights on, and the signal decreases as the outcome of flicking that switch becomes expected. One day, however, you walk into this dark room, flick the switch and, to your surprise, no lights come on. This outcome is surprising and negative. It is theorized to induce a ‘negative prediction error’, and if the switch continues to fail, you will learn to avoid it when trying to turn the lights on. These are the basics of prediction error signals: an outcome must be unexpected to generate a prediction error, and the error can be either positive (better than expected) or negative (worse than expected).
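The light-switch story can be written as a small simulation. The following is a minimal sketch of a Rescorla-Wagner style update for a single cue, not code from any cited study. The function name, the learning rate of 0.3, and the coding of outcomes as 1.0 (lights on) and 0.0 (lights off) are assumptions chosen for illustration.

```python
def rescorla_wagner(outcomes, learning_rate=0.3, initial_expectation=0.0):
    """Track expectation and prediction error over a sequence of trials.

    outcomes: one number per trial (here 1.0 = lights on, 0.0 = lights off).
    Returns (expectations, errors), each with one entry per trial.
    """
    expectation = initial_expectation
    expectations, errors = [], []
    for outcome in outcomes:
        # Prediction error: what happened minus what was expected.
        error = outcome - expectation
        expectations.append(expectation)
        errors.append(error)
        # Move the expectation a fraction of the way toward the outcome.
        expectation += learning_rate * error
    return expectations, errors


# Hypothetical schedule: the switch works for 20 trials, then fails for 5.
outcomes = [1.0] * 20 + [0.0] * 5
expectations, errors = rescorla_wagner(outcomes)

print(f"first trial error:       {errors[0]:+.3f}")   # large and positive
print(f"trial 20 error:          {errors[19]:+.3f}")  # close to zero
print(f"first failed-trial error: {errors[20]:+.3f}")  # large and negative
```

Under these assumptions the error starts large and positive, shrinks toward zero as the outcome becomes expected, and turns strongly negative on the first trial where the switch fails.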
In the late 1990s, an important discovery was made. Dopamine neurons in the midbrain of monkeys were observed to increase firing to unexpectedly positive outcomes (i.e., the unexpected delivery of a reward). This increase was transient: as more trials occurred and the reward came to be predicted, the firing of dopamine neurons at reward delivery decreased. Then, once the outcome had become expected, researchers withheld the reward, and dopamine neurons paused firing [2-5]. This finding prompted a flurry of research and was replicated across species, including humans [6], rats [7], and mice [8]. Further, the increase in dopamine neuron activity shifted from the outcome to the cue, such that a cue which predicted the outcome itself induced firing in dopamine neurons.
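The shift of the response from the reward to the cue is what temporal-difference (TD) learning predicts, the framework used in reference 4. Below is a minimal TD sketch under stated assumptions: a trial is five time steps from cue onset to reward, the period before the cue has a value fixed at zero because the cue arrives unpredictably, and the learning rate and discount factor are arbitrary. The function `run_trial` is hypothetical and is not taken from the cited work.

```python
def run_trial(values, reward, learning_rate=0.2, discount=1.0):
    """Run one cue-to-outcome trial and update `values` in place.

    values[t] is the learned value of time step t after cue onset.
    Returns (cue_error, outcome_error): the TD errors at cue onset
    and at the final step, where the reward is delivered or withheld.
    """
    n_steps = len(values)
    # At cue onset the animal moves from a zero-value waiting state
    # into the first cue state, so the error equals that state's value.
    cue_error = discount * values[0] - 0.0
    outcome_error = 0.0
    for t in range(n_steps):
        is_last = t == n_steps - 1
        r = reward if is_last else 0.0
        next_value = 0.0 if is_last else values[t + 1]
        # TD error: reward plus discounted next value minus current value.
        delta = r + discount * next_value - values[t]
        values[t] += learning_rate * delta
        if is_last:
            outcome_error = delta
    return cue_error, outcome_error


values = [0.0] * 5
history = [run_trial(values, reward=1.0) for _ in range(500)]
first_cue, first_outcome = history[0]
late_cue, late_outcome = history[-1]
# After training, withhold the reward once.
omit_cue, omit_outcome = run_trial(values, reward=0.0)

print(f"first trial: cue {first_cue:+.2f}, outcome {first_outcome:+.2f}")
print(f"late trial:  cue {late_cue:+.2f}, outcome {late_outcome:+.2f}")
print(f"omission:    cue {omit_cue:+.2f}, outcome {omit_outcome:+.2f}")
```

In this sketch the error appears at the reward on the first trial, sits at the cue after training, and dips below zero when the expected reward is withheld, mirroring the three firing patterns described above.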
Suddenly, scientists had identified a plausible neural mechanism of adaptive learning. More recently, researchers used optogenetics to test whether the observed correlations between neural activity and learning were causal. Steinberg et al. (2013) used a blocking paradigm to directly test the role of prediction error signals from dopamine neurons in learning. A blocking paradigm is a scenario in which an animal fails to learn a new piece of information because the new learning is ‘blocked’ by an old association. For example, after an animal learns that a tone predicts an outcome (water), experimenters present the tone together with another cue (light), followed by the same outcome. On test day, just the ‘blocked’ cue (the light) is presented, to see whether the animal learned anything about it. Because the outcome is exactly the same for the paired light and tone as it was for the tone alone, no prediction error is generated, and the animal should learn nothing about the light. By optogenetically activating dopamine neurons during the tone and light pairing, Steinberg et al. were able to artificially induce a positive prediction error and create new learning, marking the first causal evidence supporting this theory [9]. These data solidified the role of midbrain dopamine neurons in signaling positive prediction errors and showed how these signals can drive behavior.
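Blocking falls out of the Rescorla-Wagner rule when cues presented together share one prediction error. The sketch below illustrates that logic only. Modeling optogenetic stimulation as a constant `extra_error` added to the prediction error is an assumption made here for illustration, not the analysis used by Steinberg et al., and the trial counts, learning rate, and function names are likewise hypothetical.

```python
def train(strengths, cues, outcome, n_trials, learning_rate=0.2, extra_error=0.0):
    """Rescorla-Wagner updates for a set of cues presented together.

    strengths: dict mapping cue name to associative strength (updated in place).
    extra_error: assumed stand-in for an artificially added prediction error.
    """
    for _ in range(n_trials):
        # The expectation is the summed strength of all cues present.
        expected = sum(strengths[c] for c in cues)
        error = outcome - expected + extra_error
        # Every cue present on the trial learns from the same shared error.
        for c in cues:
            strengths[c] += learning_rate * error
    return strengths


def blocking_experiment(extra_error=0.0):
    """Return what is learned about the light after tone pretraining."""
    strengths = {"tone": 0.0, "light": 0.0}
    # Phase 1: tone alone predicts water.
    train(strengths, ["tone"], outcome=1.0, n_trials=50)
    # Phase 2: tone and light together, same water outcome.
    train(strengths, ["tone", "light"], outcome=1.0, n_trials=20,
          extra_error=extra_error)
    return strengths["light"]


blocked = blocking_experiment()
unblocked = blocking_experiment(extra_error=0.5)

print(f"light strength, no added error:   {blocked:.4f}")
print(f"light strength, with added error: {unblocked:.4f}")
```

Without the added error the tone already predicts the water, so almost nothing is learned about the light. With the added error, the light acquires strength even though the real outcome never changed.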
The Unanswered Questions
At the same time as the role of midbrain dopamine neurons in signaling prediction errors was being confirmed, other experimental data were emerging which suggested that the real picture was less neat. In some recording experiments, some neurons increased firing to both unexpected negative and unexpected positive outcomes, contrary to the results described above. These neurons are thought to signal salience: an unexpected event, regardless of valence, is potentially important, and behavior must be invigorated either to repeat the actions that produced the outcome (if it was appetitive) or to escape or avoid it in the future (if it was aversive). Such responses were recently identified in behaving primates working either for a fluid reward or to avoid a puff of air to the eye [10]. Additionally, the locations of dopamine neurons which signal value, salience, and prediction error are not discrete, but instead overlap over a relatively large area [11]. Whether this overlap is a product of neurodevelopment or of experience remains to be seen.
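One simple way to picture the difference between the two response types is to compare a signed prediction error with its magnitude. This is a toy illustration only: treating salience as the absolute value of the prediction error is a simplifying assumption, and the outcome values (+1 for a fluid reward, -1 for an air puff) and function names are hypothetical.

```python
def value_signal(outcome, expected):
    """Signed prediction error: positive if better than expected, negative if worse."""
    return outcome - expected


def salience_signal(outcome, expected):
    """Unsigned surprise: large for any unexpected event, whatever its valence."""
    return abs(outcome - expected)


# (label, outcome, expected) with assumed values for illustration.
events = [
    ("unexpected reward", 1.0, 0.0),
    ("unexpected air puff", -1.0, 0.0),
    ("fully expected reward", 1.0, 1.0),
]

for label, outcome, expected in events:
    v = value_signal(outcome, expected)
    s = salience_signal(outcome, expected)
    print(f"{label:22s} value {v:+.1f}  salience {s:.1f}")
```

In this toy version the value signal distinguishes the reward from the air puff, while the salience signal responds equally to both and stays silent only when the outcome was fully expected.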
So what role do midbrain dopamine neurons play in decision making? It appears that some neurons are crucial for providing an alerting signal, identifying when something unexpected has happened, whether appetitive or aversive in nature. At the same time, other neurons provide the relative valence of this unexpected outcome, signaling whether something good or bad occurred. These two neural populations must work in concert to provide a teaching signal, helping animals learn which cues predict which outcomes and guiding behavior towards the more appetitive options in future decisions. Together, the phasic activity of midbrain dopamine neurons serves to alert, teach, and inform, all critical functions for an organism to live an adaptable life.
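- 1. Rescorla RA & Wagner AR (1972) A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Classical Conditioning II: Current Research and Theory, eds A. Black & W.F. Prokasy (Appleton-Century-Crofts) 64–99.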
- 2. Schultz W (1998) Predictive reward signal of dopamine neurons. Journal of Neurophysiology 80:1–27.
- 3. Schultz W (1999) The reward signal of midbrain dopamine neurons. News in Physiological Sciences 14(6):249–255.
- 4. Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275(5306):1593–1599.
- 5. Schultz W & Dickinson A (2000) Neuronal coding of prediction errors. Annual Review of Neuroscience 23:473–500. doi: 10.1146/annurev.neuro.23.1.473
- 6. D'Ardenne K et al. (2008) BOLD responses reflecting dopaminergic signals in the human ventral tegmental area. Science 319(5867):1264–1267. doi: 10.1126/science.1150605
- 7. Roesch MR, Calu DJ, Schoenbaum G (2007) Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nature Neuroscience 10:1615–1624. doi: 10.1038/nn2013
- 8. Cohen JY et al. (2012) Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482:85–88. doi: 10.1038/nature10754
- 9. Steinberg EE et al. (2013) A causal link between prediction errors, dopamine neurons and learning. Nature Neuroscience doi: 10.1038/nn.3413
- 10. Matsumoto M & Hikosaka O (2009) Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459:837–841. doi: 10.1038/nature08028
- 11. Bromberg-Martin ES, Matsumoto M, Hikosaka O (2010) Dopamine in motivational control: rewarding, aversive, and alerting. Neuron 68:815–834. doi: 10.1016/j.neuron.2010.11.022