Neural computations associated with goal-directed choice
Introduction
Consider a canonical decision-making problem. Every day a hungry animal is placed at the bottom of a Y-maze and is allowed to run towards the upper left or right to collect a reward. The left arm leads to a highly liked food, but is also associated with a high cost since the animal is required to swim to reach it. The right arm leads to a less desirable outcome, but does not require swimming. The foods randomly change each day. How does the animal decide which course to take?
A growing body of work has shown that this problem can be solved using two very different approaches [1, 2, 3, 4]. In one approach, animals learn the value of each action through trial-and-error using reinforcement learning, and then take the action with the highest learned value [2, 4, 5, 6, 7, 8, 9]. This strategy requires little knowledge on the part of the subject and can account for multiple aspects of behavior in many domains, but is only able to pick the optimal action on average. In another approach, animals estimate the value associated with each action in every trial using knowledge about its costs and benefits. With sufficient knowledge, this approach, often called ‘goal-directed’ or ‘model-based’ decision-making [7, 8, 10], can do much better since it is able to pick the optimal action in every trial [10].
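The contrast between the two approaches can be illustrated with a small simulation of the Y-maze. This is only a sketch under stated assumptions: the payoff ranges, the swim cost, the learning rate `alpha`, the exploration rate `epsilon`, and the epsilon-greedy rule are all hypothetical choices made for illustration, and the goal-directed chooser is assumed to know each day's costs and benefits exactly.

```python
import random

SWIM_COST = 3.0  # hypothetical cost of the left arm; the right arm costs nothing


def simulate(n_days=1000, alpha=0.1, epsilon=0.1, seed=0):
    """Return mean daily payoff for (trial-and-error learner, goal-directed chooser)."""
    rng = random.Random(seed)
    q = {"left": 0.0, "right": 0.0}  # learned action values
    total_learned = 0.0
    total_goal_directed = 0.0
    for _ in range(n_days):
        # Today's foods: hypothetical liking drawn independently for each arm,
        # with the swim cost already subtracted for the left arm.
        payoff = {
            "left": rng.uniform(0.0, 10.0) - SWIM_COST,
            "right": rng.uniform(0.0, 6.0),
        }
        # Trial-and-error learner: acts on learned averages, sometimes explores.
        if rng.random() < epsilon:
            action = rng.choice(["left", "right"])
        else:
            action = max(q, key=q.get)
        q[action] += alpha * (payoff[action] - q[action])  # prediction-error update
        total_learned += payoff[action]
        # Goal-directed chooser: evaluates today's costs and benefits first.
        total_goal_directed += max(payoff.values())
    return total_learned / n_days, total_goal_directed / n_days


learned_mean, goal_directed_mean = simulate()
```

Because the goal-directed chooser takes the better arm on every single day, its mean payoff can never fall below that of the learner, which can only track which arm is better on average.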
Over the past decade significant advances have been made in understanding how the brain makes goal-directed choices. We review important findings from the past few years, as well as some of the most pressing open questions. Owing to space limitations we do not attempt to be comprehensive.
Computational framework
Models from psychology and economics suggest that goal-directed choice requires the following computations. First, the brain computes stimulus values that measure the value of the outcomes generated by each action. Second, it computes action costs that measure the costs associated with each course of action. Third, it integrates them into action values given by: action value = stimulus value − action cost.
Finally, the action values are compared in order to make a choice.
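The four computations can be sketched in a few lines of code. This is a minimal illustration, not a claim about neural implementation: it assumes the simplest integration rule, in which the action cost is subtracted from the stimulus value, and the function names and the numbers in the Y-maze example are hypothetical.

```python
from typing import Dict, Tuple


def action_value(stimulus_value: float, action_cost: float) -> float:
    """Integrate benefit and cost; plain subtraction is an assumed form."""
    return stimulus_value - action_cost


def choose(options: Dict[str, Tuple[float, float]]) -> str:
    """Return the action with the highest action value.

    `options` maps an action name to (stimulus_value, action_cost).
    """
    values = {a: action_value(sv, cost) for a, (sv, cost) in options.items()}
    return max(values, key=values.get)


# Y-maze example with made-up numbers: the left arm has the better food
# but requires swimming, so its cost is high.
maze = {"left": (10.0, 7.0), "right": (5.0, 1.0)}
best = choose(maze)
```

With these illustrative numbers the left arm is worth 3 and the right arm is worth 4, so the less liked but cheaper food is chosen.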
We now describe what is known
How are stimulus values encoded?
Several human fMRI studies have placed individuals in simple choice situations and have found that BOLD activity in the medial orbitofrontal cortex (mOFC) correlates with behavioral measures of stimulus values [11, 12, 13•]. These findings are consistent with monkey neurophysiology studies that have found stimulus value coding in OFC neurons during choice tasks [14, 15••, 16•, 17] (Figure 1). Note, however, that we must be cautious when comparing OFC findings across species owing to potential
How are stimulus values computed?
A popular theory states that stimulus values are learned through reinforcement learning and retrieved in OFC at the time of choice [2, 9, 29]. Although some evidence suggests that this process is at work in settings where animals repeatedly face a small number of stimuli [30], it cannot account for all observed behavior because humans are able to evaluate novel stimuli. We propose an alternative theory of stimulus value computation that takes advantage of the fact that most stimuli are complex
Stimulus valuation in complex decision situations
Two recent human fMRI studies provide clues about how the brain has adapted to solve more sophisticated choice problems, such as dietary decisions with long-term consequences, or complex social decisions. In order to make good choices in these domains, the brain needs to compute the value of attributes such as the impact of the choice on future health, or on others’ well-being. Hare et al. [33••] studied dietary choices that involve self-control. Subjects made choices between stimuli that
How are action costs encoded and computed?
Almost every choice we make has costs associated with it. These costs come in two types. First are the costs of the actions required to obtain the stimuli, such as effort. Second are aversive stimuli that are bundled with the desired outcome. For example, purchasing a book requires giving up money. The key distinction between them is whether the cost is tied to the action or to the outcome. The distinction is meaningful because, for example, one can decrease the effort costs associated with
How are action values encoded and computed?
The computational model described above predicts that there should be neurons encoding each of the action values, regardless of whether the action is taken or not. Samejima et al. [42] recorded from striatal neurons during a probabilistic binary choice task and found neurons encoding the action values. In a closely related study Lau and Glimcher [43] found that about 60% of phasically active neurons in the caudate encoded either the value of particular actions (early in the trial) or the value
How are action values compared to make a choice?
The final stage in making a decision involves the comparison of the action values in order to make a choice. A significant amount of behavioral evidence suggests that the mapping from action values to choices is stochastic and follows a soft-max (or logistic) functional form, and that there is a speed-accuracy tradeoff. Two of the most important open questions in the field have to do with how the stochastic choice process is implemented: What exactly is the algorithm used by the brain to
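A soft-max mapping from action values to choice probabilities can be sketched as follows. The inverse-temperature parameter `beta` and the function name are assumptions introduced for illustration; the sketch shows only the functional form and says nothing about how the brain might implement it or about the speed-accuracy tradeoff.

```python
import math
from typing import List, Sequence


def softmax_choice_probs(action_values: Sequence[float], beta: float = 1.0) -> List[float]:
    """Map action values to choice probabilities.

    `beta` (assumed parameter) controls how deterministic the choice is:
    beta = 0 gives uniform random choice, large beta approaches always
    picking the highest-valued action.
    """
    # Subtract the maximum before exponentiating for numerical stability;
    # this leaves the resulting probabilities unchanged.
    top = max(action_values)
    weights = [math.exp(beta * (v - top)) for v in action_values]
    total = sum(weights)
    return [w / total for w in weights]


probs = softmax_choice_probs([3.0, 4.0], beta=1.0)
```

With two actions the rule reduces to a logistic function of the difference in action values, which is why the two names are often used interchangeably for binary choice.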
Conclusions
Throughout the review we have emphasized a multitude of important and pressing open questions. However, it is important not to lose sight of the progress that has been made. We now know that OFC neurons encode stimulus values in a wide variety of contexts and that values are sensitive to internal physiological and cognitive states. We know that stimulus value signals respond to variables such as delay and risk in ways that are consistent with theories from behavioral economics. We know that
References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
Support of the NSF (AR3.SELFCNTRL-1-NSF.ARR1) and the Gordon and Betty Moore Foundation is gratefully acknowledged.
References (73)
Neural bases of food-seeking: affect, arousal and reward in corticostriatolimbic circuits. Physiol Behav (2005)
Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology (1998)
The misbehavior of value and the discipline of the will. Neural Netw (2006)
Architectonic subdivision of the human orbital and medial prefrontal cortex. J Comp Neurol (2003)
Psychology and neurobiology of simple decisions. Trends Neurosci (2004)
A comparison of sequential sampling models for two-choice reaction time. Psychol Rev (2004)
The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced choice tasks. Psychol Rev (2006)
The basal ganglia and cortex implement optimal decision making between alternative actions. Neural Comput (2007)
Perceptual decisions between multiple directions of visual motion. J Neurosci (2008)
The basal ganglia: a vertebrate solution to the selection problem? Neuroscience (1999)
Cortical substrates for exploratory decisions in humans. Nature
Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. J Neurosci
Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. J Neurosci
Multiple forms of value learning and the function of dopamine
A framework for studying the neurobiology of value-based decision making. Nat Rev Neurosci
The role of value systems in decision making
A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and non-reinforcement
Adaptive critic in the basal ganglia
Theoretical and empirical studies of learning
Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci
Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci
Determining the neural substrates of goal-directed learning in the human brain. J Neurosci
Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J Neurosci
Neurons in the orbitofrontal cortex encode economic value. Nature
Range-adapting representation of economic value in the orbitofrontal cortex. J Neurosci
The representation of economic value in the orbitofrontal cortex is invariant for changes of menu. Nat Neurosci
Neuronal activity in primate dorsolateral and orbital prefrontal cortex during performance of a reward preference task. Eur J Neurosci
General mechanisms for making decisions? Curr Opin Neurobiol
Choice, uncertainty and value in prefrontal and cingulate cortex. Nat Neurosci
The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cereb Cortex
Midline and intralaminar thalamic connections with the orbital and medial prefrontal networks in macaque monkeys. J Comp Neurol
Complementary circuits connecting the orbital and medial prefrontal networks with the temporal, insular, and opercular cortex in the macaque monkey. J Comp Neurol
The neural basis of loss aversion in decision-making under risk. Science
Prospect Theory: an analysis of decision under risk. Econometrica
The neural representation of subjective value under risk and ambiguity. J Neurophysiol
The neural correlates of subjective value during intertemporal choice. Nat Neurosci