INTRODUCTION

Although the acute effects of drugs of abuse, such as cocaine, act to increase the levels of dopamine (DA) in the NAc (Aragona et al, 2008; Di Chiara and Imperato, 1988; Stuber et al, 2005), repeated self-administration of cocaine has been shown to induce persistent changes in DA signaling and DA-linked behaviors, even after prolonged abstinence from drug-taking. For example, chronic cocaine use is associated with alterations in synaptic transport of DA (Addy et al, 2010; Calipari et al, 2012; Ferris et al, 2012), augmentations in phasic DA signals to salient rewarding stimuli (Saddoris et al, 2016; Willuhn et al, 2014), and impairments in value-based behaviors, such as outcome devaluation (LeBlanc et al, 2013; Schoenbaum and Setlow, 2005) and reversal learning (Calu et al, 2007; Jentsch et al, 2002; Schoenbaum et al, 2004). Collectively, such drug-associated behavioral changes suggest that intact DA signaling may be important for aspects of reward encoding and that chronic experience with cocaine disrupts this process.

Much is known about the role of phasic DA signals during learning and action, though DA’s role in reward processing has been studied largely in the context of associative tasks. For example, early in learning when the cue–outcome association is poorly learned, DA activity is greatest at the time of reward receipt, though this signal often ‘shifts’ to predictive stimuli with learning (Day et al, 2007; Schultz et al, 1997). This pattern of encoding at reward is thought to represent a reward prediction error (RPE), rather than the value of the reward itself (Hart et al, 2014; Saddoris et al, 2015a; Sugam et al, 2012; Waelti et al, 2001). Recent findings have shown that manipulation of the DA signal during the reward phase of learning can alter animals’ valuation of the reward. For example, either optogenetic stimulation or inhibition of DA neurons in the ventral tegmental area (VTA) during reward receipt is sufficient to alter outcome sensitivity in Pavlovian blocking and unblocking tasks (Chang et al, 2016; Steinberg et al, 2013). Relatedly, DA signals during reward receipt are disrupted in cocaine-experienced animals, with subjects displaying abnormally large reward-related DA release in the NAc core, where it is normally low, and virtually none in the shell, where it is normally present (Saddoris et al, 2016).

However, less is known about how DA might encode information about rewards when they are unsignaled by discrete stimuli. This distinction may be important for dissociating the DA ‘teaching signal’ of RPE models (Day et al, 2007; Saddoris et al, 2015a; Schultz et al, 1997) from the DA signal associated with the experience of reward consumption itself. For example, unsignaled reward delivery has been shown to elicit phasic DA release in the NAc core (Nasrallah et al, 2011). In primates, putative DA neurons displayed a multiphasic signal to unsignaled rewards composed of a rapid non-selective peak response and a slower postpeak epoch that discriminated reward magnitude (Stauffer et al, 2014). Thus the DA signal for unsignaled rewards is complex and, in important ways, fails to recapitulate the more temporally precise and value-modulated peak DA response in relation to reward-predictive stimuli (Day et al, 2010; Gan et al, 2010; Saddoris et al, 2015b; Sugam et al, 2012).

Although previous work suggests that shell may be more biased toward reward encoding in learning tasks (Beyene et al, 2010; Cacciapaglia et al, 2012; Saddoris et al, 2015a; Saddoris et al, 2013; Stopper and Floresco, 2011), no studies have compared core and shell DA signaling for unsignaled rewards. Here we measured rapid DA release in the core and shell of drug-naive and cocaine-experienced rats while they received unsignaled presentations of rewards of different magnitudes (1 pellet or 2 pellets) and later tested their sensitivity to reward magnitude in a Free Choice task.

MATERIALS AND METHODS

Subjects

Male Sprague-Dawley rats (n=25) were used. During all phases of the experiment, single-housed rats were allowed ad libitum access to water in their home cages and maintained on a 12 : 12 light:dark schedule. All subjects were previously trained in appetitive conditioning experiments (see Supplementary Materials). Experiments were performed in accordance with UNC Chapel Hill Institutional Animal Care and Use Committee protocols.

Behavior

Self-administration

Detailed descriptions of this task appear elsewhere (Saddoris and Carelli, 2014; Saddoris et al, 2016). Briefly, at least 1 month prior to testing, a subset of rats (n=19) were implanted with intrajugular catheters. Following recovery, rats were randomly assigned to either the intravenous cocaine self-administration group (Cocaine; n=7) or water self-administration group (Control; n=12) and water deprived (20 ml/day plus any fluids obtained during session) for the duration of self-administration training. All self-administration sessions were performed in a standard rat chamber (Context A: 25 × 25 × 30 cm3, stainless steel rod floor, steel side walls, beige sound-attenuating cabinet interior; MED Associates, St Albans, VT). For the Cocaine subjects (Figure 1a), presses on a lever below an illuminated cue light resulted in an infusion of intravenous cocaine (0.33 mg/infusion; ~1 mg/kg) coupled to a 20 s presentation of a houselight and intermittent tone, extinguishing of the cue light, and retraction of the lever. For the Controls, presses on the lever under the illuminated cue light resulted in the same stimuli (houselight/tone, lever retraction), but rats received water (250 μl) delivered to a centrally located foodcup as the reinforcer. Controls also received yoked saline infusions based on the self-administration schedule of a rat in an adjacent box. Both groups were allowed to press for 2 h per session for 14 sessions. Following this, all rats entered a period of enforced abstinence for 30 days by remaining in their home cages in the colony room with ad libitum access to food and water.

Figure 1

(a) Schematic of behavioral training involving self-administration. Sessions were 2 h per day for 14 days. Cocaine rats self-administered intravenous cocaine (0.33 mg/infusion) along with a 20 s tone/houselight stimulus by pressing a lever. Controls pressed a lever for self-administered water to a foodcup along with the tone/houselight stimulus and received yoked intravenous 0.9% saline infusions. All rats (including a subset without self-administration experience) underwent 30 days of abstinence before performance on the Unsignaled Pellet Task. After this, a group of rats performed a Free Choice instrumental task. (b) Schematic of the Free Choice task in which presses on one lever produced 1 pellet and presses on the other lever produced 2 pellets. Rats received 20 Lever A only trials, 20 Lever B only trials, and 30 Free Choice trials where both levers were available and reinforced based on presses for the chosen lever’s assigned reward magnitude. (c) Mean self-administration rates of pressing over the 14 days of training for rats pressing for water (Controls; open circles; n=12) or intravenous cocaine (Cocaine; black squares; n=7). Error bars represent ±SEM. *P<0.05, Controls vs Cocaine. (d) Histology of recording sites in the core (squares) and shell (circles) in Controls (black) and Cocaine subjects (gray).


Unsignaled Pellet Task

Several days prior to recording, rats were lightly food deprived (15–18 g chow/day) to 90–95% of their free feeding weight and maintained on that schedule for the duration of all testing. Rats were subsequently tested in a custom voltammetric behavioral chamber (Context B: 43 × 43 × 53 cm3, smooth Plexiglas floors and clear Plexiglas walls, copper mesh Faraday cage sound-attenuating chamber interior; MED Associates) that was located in a separate room from the self-administration chambers and easily discriminable from that context (see description of self-administration chambers above). Rats received unsignaled deliveries of either 1 Pellet (45 mg sucrose, Purina Test Diet) or 2 Pellets delivered to the foodcup via a food hopper (MED Associates) located outside of the chamber. Rats received 15 trials of each type (1 Pellet or 2 Pellets), with a variable ITI (30±10 s) between trials, with the order of 1-Pellet and 2-Pellet trials randomly intermixed (Figure 1a).
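The session structure described above can be sketched in a few lines of Python. This is only an illustration and not the software used for the experiment: the function name `make_pellet_schedule` is hypothetical, and the variable ITI is assumed to be drawn uniformly from 30±10 s, which the text does not specify.

```python
import random

def make_pellet_schedule(n_per_type=15, iti_mean=30.0, iti_jitter=10.0, seed=None):
    """Return a list of (onset_s, n_pellets) tuples for one session.

    Hypothetical helper: 15 trials each of 1 Pellet and 2 Pellets,
    randomly intermixed, separated by a variable ITI.
    """
    rng = random.Random(seed)
    trials = [1] * n_per_type + [2] * n_per_type
    rng.shuffle(trials)  # randomly intermix 1-Pellet and 2-Pellet trials
    schedule = []
    t = 0.0
    for n_pellets in trials:
        # Assumption: ITI is uniform on [mean - jitter, mean + jitter] seconds.
        t += rng.uniform(iti_mean - iti_jitter, iti_mean + iti_jitter)
        schedule.append((t, n_pellets))
    return schedule

session = make_pellet_schedule(seed=1)
print(len(session), "unsignaled deliveries")
```

A fixed `seed` makes a given session order reproducible; omitting it yields a different random order on each call.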

Voltammetry

Voltammetric recordings taken during the Unsignaled Pellet Task were identical to those described previously using TarHeel CV for acquisition, and HDCV Analysis for signal processing (UNC, Chapel Hill) (Saddoris et al, 2015a). See Supplementary Methods for detail.

To isolate the kinetics of DA signaling, we compared reward-elicited DA to release patterns driven by electrical stimulation of VTA afferents. These were generated for each subject in the course of developing a training set specific for each electrode and at each recording location (Rodeberg et al, 2015). These electrically generated stimulations allowed for comparison of reward-evoked DA events to the typical clearance rate owing to transporter-based kinetics in the recorded region (see Supplementary Methods for more detail). To ensure that clearance dynamics are similar between cues and electrical stimulations, we only used stimulations that had a peak release (within 1 s following stimulation) of less than either 150 nM (core) or 100 nM (shell) above baseline, similar to previous work (Saddoris et al, 2015a).
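The inclusion rule for electrical stimulations can be expressed as a simple filter. The sketch below is illustrative only: the function name, the input format (a baseline-subtracted trace in nM), and the 10 Hz sampling rate are assumptions rather than details taken from the acquisition software.

```python
# Peak ceilings (nM above baseline) stated in the text for each region.
PEAK_LIMIT_NM = {"core": 150.0, "shell": 100.0}

def stimulation_is_usable(trace_nm, region, stim_index, sample_rate_hz=10.0):
    """Return True if peak [DA] within 1 s after stimulation is below the ceiling.

    trace_nm: baseline-subtracted [DA] samples in nM (hypothetical format).
    stim_index: index of the sample at which stimulation was delivered.
    sample_rate_hz: assumed 10 Hz, i.e. one sample per 100 ms.
    """
    n_window = int(round(1.0 * sample_rate_hz))
    # Include the sample at exactly 1 s after stimulation.
    window = trace_nm[stim_index:stim_index + n_window + 1]
    return max(window) < PEAK_LIMIT_NM[region]

example = [0.0, 0.0, 40.0, 120.0, 90.0, 60.0, 30.0, 10.0, 0.0, 0.0, 0.0, 0.0]
print(stimulation_is_usable(example, "core", stim_index=1))   # peak of 120 nM is under 150 nM
print(stimulation_is_usable(example, "shell", stim_index=1))  # peak of 120 nM exceeds 100 nM
```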

Data from FSCV recordings were measured with multiple metrics (see Supplementary Methods). Briefly, we analyzed peak (largest [DA] within a defined phase) and total DA (summated [DA] across each 100 ms bin within a defined phase) for both reward magnitudes and the aligned electrical stimulation. Defined phases included Early (0–4 s following reward delivery/stimulation) or Late (4–8 s following reward/stimulation). To ensure reliable comparisons between groups on measures of peak and area under the curve, we background subtracted the average baseline from each trial.
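As a minimal sketch of these two metrics, the following Python function computes peak and total DA in the Early and Late phases after subtracting the mean baseline. It is not the HDCV Analysis implementation. The 10 Hz sampling rate (one sample per 100 ms bin), the 2 s baseline window, and the half-open phase boundaries (0 to <4 s, 4 to <8 s) are assumptions made for illustration.

```python
# Phase windows in seconds relative to reward delivery or stimulation.
PHASES_S = {"Early": (0.0, 4.0), "Late": (4.0, 8.0)}

def phase_metrics(trace_nm, event_index, baseline_s=2.0, sample_rate_hz=10.0):
    """Return {phase: {"peak": ..., "total": ...}} for one trial.

    trace_nm: [DA] samples in nM, one per 100 ms bin (assumed 10 Hz).
    event_index: sample index of reward delivery or stimulation.
    baseline_s: assumed pre-event baseline window; not given in the text.
    """
    n_base = int(round(baseline_s * sample_rate_hz))
    if event_index < n_base:
        raise ValueError("not enough pre-event samples for the baseline window")
    baseline = trace_nm[event_index - n_base:event_index]
    base_mean = sum(baseline) / len(baseline)
    corrected = [x - base_mean for x in trace_nm]  # background subtraction
    out = {}
    for name, (start_s, stop_s) in PHASES_S.items():
        lo = event_index + int(round(start_s * sample_rate_hz))
        hi = event_index + int(round(stop_s * sample_rate_hz))
        window = corrected[lo:hi]
        # Peak: largest [DA] in the phase; total: [DA] summed over 100 ms bins.
        out[name] = {"peak": max(window), "total": sum(window)}
    return out

# Toy trace: 2 s baseline at 10 nM, 4 s at 30 nM, then 4 s at 15 nM.
toy = [10] * 20 + [30] * 40 + [15] * 40
print(phase_metrics(toy, event_index=20))
```

Summing across bins gives total DA in units of nM × bins; multiplying by the bin width (0.1 s) would convert it to an area under the curve in nM·s.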

Magnitude Choice Task

At least 1 day following the Unsignaled Pellet Task, a set of subjects with the same experience with the Pellet Task were subsequently run on a Magnitude Choice Task (n=12; Figure 1b). This task was carried out in chambers that were different from the Unsignaled Pellet Task (Context C); voltammetric recordings were not taken during these sessions. All subjects were from the self-administration set, and we obtained behavioral sessions from eight Controls and four Cocaine rats. In these sessions, rats were first shaped such that one lever (eg, left) was associated with delivery of 1 Pellet on a fixed ratio 1 (FR1) schedule. When the rat performed 50 presses within a 1-h session, the session terminated, and the next day, the opposite lever (eg, right) was presented (FR1, 1 Pellet per press). The session again ended after 50 presses within the 1-h session. If the rat failed to reach 50 presses within the hour, it was run on the same contingency the next day. Finally, rats were trained on a discrimination task. One lever was designated the 1-Pellet lever while the opposite lever was the 2-Pellet lever. Assignment of magnitude for the levers was counterbalanced across subjects. For each session, rats received 40 Forced Choice trials (20 each 1 Pellet and 2 Pellet), where one lever was extended into the chamber. Presses on the lever within 10 s (FR1) resulted in delivery of the assigned magnitude reward (either 1 or 2 pellets) to the foodcup, while failure to press within 10 s was an error. Within the same session, rats then had Free Choice trials where both the 1-Pellet and 2-Pellet levers were simultaneously extended into the chamber, and presses on either lever delivered the associated reward. Rats were run on this same task for 4 consecutive days.

Statistics

For all analyses, we used a factorial analysis of variance first within a region and stimulus type (1 Pellet, 2 Pellet, and stimulation), and then separately to explore the effects of Drug (Control vs Cocaine) and Region (Core vs Shell in Controls). All post-hoc tests used Tukey’s HSD to control for multiple comparisons. Effect size was reported using partial eta squared (η2), which corresponds to the proportion of variability attributable to each factor independent of sample size. Data were included for animals that showed appropriate self-administration (at least 10 presses per day during the last 5 days of self-administration within the 2-h sessions) and from which we successfully obtained FSCV recordings. Because some trials were excluded owing to ‘glitches’ (ie, violations of Qα owing to mechanical/electrical error or unknown sources of deterministic noise), we obtained recordings from trials in Controls (n=364 core, 188 shell) and Cocaine subjects (n=104 core, 92 shell).
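For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity η2 = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies this identity; the function name is our own and the snippet is not part of the original analysis pipeline.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom.

    Follows from eta_p^2 = SS_effect / (SS_effect + SS_error)
    and F = (SS_effect / df_effect) / (SS_error / df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Values reported in the Results for peak DA in the core of Controls.
print(round(partial_eta_squared(14.41, 2, 360), 3))  # 0.074, main effect of Stimulus Type
print(round(partial_eta_squared(10.89, 4, 720), 3))  # 0.057, Stimulus x Phase interaction
```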

RESULTS

Behavior

For self-administration (Figure 1c), rats in both the Control and Cocaine groups showed similar amounts of lever pressing by the end of the 14-day self-administration sessions (see Supplementary Materials for details).

Histology

Placements of electrode tips for recordings (Control Core, n=12; Control Shell, n=7; Cocaine Core, n=4; Cocaine Shell, n=3) are shown in Figure 1d.

NAc Core Encodes Reward Receipt but not Magnitude

During the Unsignaled Pellet Task, trial-averaged DA traces in the core for Controls showed an increase in DA release immediately following pellet deliveries similar in peak and clearance kinetics to an electrical stimulation but then showed elevated DA levels above the electrical stimulation several seconds after reward consumption (Figure 2a). We found a main effect of Stimulus Type (1 Pellet, 2 Pellet, electrical stimulation), F(2, 360)=14.41, P<0.00001, η2=0.074, and a significant interaction between Stimulus Type × Phase (baseline, Early, Late), F(4, 720)=10.89, P<0.00001, η2=0.057 (Figure 2b). All three stimuli showed nearly identical peak DA in the Early Phase (all P>0.68), whereas peak DA was significantly greater for both the 1-Pellet and 2-Pellet trials than for the electrical stimulation during the Late Phase (both P<0.00002), while the two Pellet trial types did not differ from each other during that same period (P>0.90). Total DA release showed the same pattern of results (Figure 2c; Stimulus Type × Phase, F(4, 720)=6.67, P<0.0001, η2=0.036; 1 Pellet vs 2 Pellets, P=0.97; see Supplementary Results for details). Thus phasic DA in the core encoded a biphasic response to reward delivery: a non-specific early peak followed by a slower postpeak response that detected reward delivery but not magnitude.

Figure 2

Voltammetric recordings of real-time DA release taken in Controls during the Unsignaled Pellet Task in the NAc core (a–c; n=346 trials) and shell (d–f; n=188 trials). (a) Average traces of DA release in the core aligned to the delivery of 1 Pellet (blue), 2 Pellets (green), or electrical stimulation of the VTA (gray). Shaded blue region indicates the Early Phase of the DA signal (0–4 s following reward delivery/stimulation) and shaded green region indicates the Late Phase of the DA signal (4–8 s following reward/stimulation). (b) Peak DA (greatest concentration of phasic release) for individual trials within the Early (left bars) and Late (right bars) Phases of the DA signal in the NAc core. (c) Cumulative (summed) DA during the Early (left bars) and Late (right bars) Phases of the DA signal in the NAc core. (d) Average traces of DA release in the shell aligned to the delivery of 1 Pellet (red), 2 Pellets (orange), or electrical stimulation of the VTA (gray). (e) Peak DA within the Early and Late Phases of the DA signal in the NAc shell. (f) Cumulative DA during the Early and Late Phases of the DA signal in the NAc shell. Error bars represent ±SEM. *P<0.05, 1 Pellet vs 2 Pellets; †P<0.0001, Electrical stimulation vs 1 Pellet and vs 2 Pellets. NS, not significant.


DA Release in the NAc Shell Encodes Reward Magnitude

Next we examined how DA release in the NAc shell encoded information about unsignaled rewards (Figure 2d). Unlike the core, shell DA was more sensitive to reward magnitude as indicated by a significant main effect of Stimulus Type, F(2, 172)=4.57, P=0.02, η2=0.050, and a significant interaction between Stimulus Type × Phase, F(4, 344)=8.70, P<0.00001, η2=0.092 (Figure 2e). In the Early Phase, none of the stimuli differed by peak (all P>0.76). However, during the Late Phase, the 2-Pellet trials elicited greater DA than the 1-Pellet trials (P=0.01), while both Pellet trial types elicited greater DA than electrical stimulation (P<0.003). Consistent with this, Late-Phase DA for both Pellet trial types was greater than baseline (P<0.0001) while electrical stimulation was similar to baseline (P=0.73). Cumulative DA signals were similar to the findings of peak (Figure 2f; Stimulus Type × Phase, F(4, 344)=8.91, P<0.00001, η2=0.094; 1 Pellet vs 2 Pellets, P=0.001; see Supplementary Materials). Thus, unlike the core, DA release in the shell was sensitive to the size of unexpected rewards but encoded this information during a late phase of the DA signal several seconds after reward delivery.

Finally, we compared core and shell reward magnitude (1 Pellet vs 2 Pellets) encoding in the Controls. For peak DA, there was a main effect of Region, F(1, 464)=44.30, P<0.00001, η2=0.084, and interactions of both Phase × Region, F(2, 928)=19.46, P<0.00001, η2=0.040, and Phase × Region × Magnitude, F(2, 298)=3.08, P=0.047, η2=0.007. Direct comparisons between regions indicated lower Early-Phase DA in the shell than in the core for both 1 Pellet (P=0.00002) and 2 Pellet (P=0.0001) trials, but during the Late Phase, shell DA was less than core for the 1-Pellet trials (P=0.0007) but not for 2-Pellet trials (P=1.0). Using AUC, we again found a significant interaction of Phase × Region × Magnitude, F(2, 298)=5.13, P=0.006, η2=0.011. Post-hoc comparisons indicated that cumulative DA between regions was not different during either baseline or the Early Phase (both P=1.0) but was significantly greater for the 2-Pellet than for the 1-Pellet trials in the Late Phase (P=0.02).

Cocaine Disrupts Phasic DA Release Patterns for Unsignaled Rewards

We then examined reward-related DA signals in animals with a history of cocaine self-administration using trial-averaged data. Unlike Controls, cocaine-experienced rats displayed DA signals in the core that were sensitive to reward magnitude (Figure 3a). Peak DA in Cocaine subjects displayed a significant interaction between Stimulus Type × Phase, F(4, 190)=11.25, P<0.00001, η2=0.192 (Figure 3b). Though peak Early-Phase DA did not discriminate Pellet magnitude (P=1.0), both the 1-Pellet and 2-Pellet trials elicited lower Early-Phase peak DA than electrical stimulation (both P<0.01). In contrast, during the Late Phase, DA levels were reliably greater for 2-Pellet trials than for both 1-Pellet trials (P=0.015) and electrical stimulation (P=0.0004), while 1-Pellet DA levels were similar to electrical stimulation (P=0.96). Cumulative DA likewise showed enhanced sensitivity to reward magnitude as revealed by a significant interaction between Stimulus Type × Phase, F(4, 190)=7.46, P<0.0001, η2=0.136, with Late-Phase 2 Pellets greater than both 1 Pellet (P=0.002) and stimulation (P=0.0001) (Figure 3c; see Supplementary Materials).

Figure 3

Voltammetric recordings of real-time DA release taken in Cocaine rats during the Unsignaled Pellet Task in the NAc core (a–c; n=104 trials) and shell (d–f; n=92 trials). (a) Average traces of DA release in the core aligned to the delivery of 1 Pellet (blue), 2 Pellets (green), or electrical stimulation of the VTA (gray). (b) Peak DA within the Early and Late Phases of the DA signal in the NAc core. (c) Cumulative DA during the Early and Late Phases of the DA signal in the NAc core. (d) Average traces of DA release in the shell aligned to the delivery of 1 Pellet (red), 2 Pellets (orange), or electrical stimulation of the VTA (gray). (e) Peak DA within the Early (left bars) and Late (right bars) Phases of the DA signal in the NAc shell. (f) Cumulative DA during the Early and Late Phases of the DA signal in the NAc shell. Error bars represent ±SEM. **P<0.02, 1 Pellet vs 2 Pellets; †P<0.01, Electrical stimulation vs 1 Pellet and vs 2 Pellets. NS, not significant.


Cocaine experience strongly attenuated DA release in the shell (Figure 3d). There was a main effect of Stimulus Type, F(2, 82)=5.31, P=0.007, η2=0.115, though the interaction between Stimulus Type × Phase did not reach significance, F(4, 164)=1.95, P=0.105, η2=0.045. Post-hoc comparisons indicated that electrical stimulation elicited significantly greater peak DA during the Early Phase than both the 1 Pellet (P=0.04) and the 2 Pellet (P=0.003) conditions, though the Pellet conditions did not differ from each other (P=0.94). During the Late Phase, there were no differences in peak DA between any of the Stimulus Types (all P>0.69). Peak DA was reliably greater than the average baseline for all Stimulus Types during the Early Phase (Tukey: all P<0.005) and Late Phase (Tukey: all P<0.02). Total DA levels (Figure 3f) were more strikingly affected by cocaine experience than peak. A main effect of Stimulus Type, F(2, 82)=6.0, P=0.004, η2=0.127, and a significant interaction between Stimulus Type × Phase, F(4, 164)=5.48, P<0.0005, η2=0.118, were found, and post-hoc comparisons indicated no differences between the 1-Pellet and 2-Pellet conditions during any Phase (all P>0.99). Indeed, total DA levels were no different from baseline during any signal phase for either Pellet type (all P>0.99), suggesting the abolishment of stimulus-related phasic release dynamics in cocaine-experienced rats. In contrast, electrical stimulation elicited greater DA during the Early Phase than both of the Pellet types and its own baseline (all P<0.0005).

Direct comparisons between Controls and Cocaine subjects on the measures of DA by region showed that in general cocaine experience blunted phasic DA signals in both the core (main effect Drug, F(1, 458)=43.57, P=0.00001, η2=0.087; interaction of Drug × Phase × Stimulus Type, F(4, 916)=5.82, P=0.0001, η2=0.025) and shell (main effect Drug, F(1, 254)=16.70, P<0.00001, η2=0.062; interaction Drug × Phase × Stimulus, F(4, 508)=3.66, P=0.006, η2=0.026; Supplementary Figure S1). Importantly, we further show that subject-averaged data largely mirrored our trial-averaged data both in DA response kinetics and peak/AUC measures of signaling (Supplementary Figure S2; see Supplementary Materials for more detail).

Behavioral Choice for Reward Magnitude Impaired by Cocaine Experience

Given the lack of differential encoding, particularly during the Early Phase, for different reward magnitudes, it was unclear whether rats were sensitive to the differences in reward magnitude between the 1-Pellet and 2-Pellet trials. To test whether subjects were able to discriminate between reward sizes, we trained rats to perform a behavioral choice task where presses on one lever delivered 1 pellet, while presses on another lever delivered 2 pellets. Drug history had no effect on Forced Choice trials; both Controls and Cocaine subjects made few errors (Figure 4a) and there was no main effect of Drug, F(1, 46)=0.49, P=0.49, η2=0.011, or interaction of Drug × Accuracy (ie, Correct, Error), F(1, 46)=2.71, P=0.11, η2=0.056. However, during Free Choice trials where both levers were presented and rats could choose their preferred option, cocaine experience significantly impaired the rats’ ability to consistently select the larger option (Figure 4b). A significant interaction between Drug × Magnitude, F(1, 46)=8.30, P=0.006, η2=0.153, revealed that Controls significantly preferred the larger reward option (P=0.0004), while Cocaine subjects failed to distinguish between the levers (P=0.99). Further, while Controls selected the 2-Pellet lever (66%) vs the 1-Pellet lever (33%) at a ratio almost exactly corresponding to the relative reward magnitudes for those levers, Cocaine subjects chose both levers at a rate that was not different from chance. Controls chose the Large reward lever significantly more often than Cocaine subjects (P=0.03).

Figure 4

Behavioral performance on the Magnitude-based Free Choice discrimination task in Controls (n=32 sessions) and Cocaine subjects (n=16 sessions). When only one lever was available on the Forced Choice trials (a), there was no difference in average accuracy between the Cocaine and Control groups. During the Free Choice trials where both levers were available (b), Controls on average chose the Large magnitude lever significantly more than the Small magnitude lever, while Cocaine rats failed to discriminate between the levers. Dashed line shows chance level preference. Error bars represent ±SEM. **P<0.0001, Large vs Small choice; †P<0.05, Control vs Cocaine.


DISCUSSION

Here we demonstrate that, in normal animals, DA signals encoding information about rewards are multiphasic: an early large and non-selective phase immediately following reward, and a slower reward-discriminative phase that was selective to the shell. Following cocaine experience, DA signals for unsignaled reward size were strikingly altered: despite decreased peak, core signals discriminated reward magnitude, while DA signals in the shell were essentially abolished. However, despite reward-discriminative DA signaling in the core, Cocaine rats failed to reliably select the larger reward in a Free Choice task.

Typically in associative learning, phasic DA immediately following cue presentation scales with anticipated reward value (Day et al, 2010; Jones et al, 2010; Ostlund et al, 2014; Saddoris et al, 2015b; Sugam et al, 2012; Wassum et al, 2013). However, DA release for unsignaled rewards in the present study followed a different pattern, as Early-Phase DA did not distinguish reward value in either the core or the shell. Indeed, it has been argued that this generic early component of DA impulse activity may alert the animal to an unexpected stimulus prior to identifying its value (Schultz, 1998, 2015), though it is possible that some irreducible aspects of the task (food hopper activation, sound of pellet arriving in foodcup) may contribute to this signal. In contrast, the Late-Phase component of the DA signal to rewards appears to encode value-related information. In both the core and the shell, the kinetics of the DA signal sharply diverged approximately 4 s after the reward delivery relative to a normal clearance rate (here modeled with the electrical stimulation of VTA afferents). Although Late-Phase DA in the core failed to discriminate reward value despite remaining above the reuptake kinetics of an electrical stimulation, Late-Phase shell DA was not only generally elevated for both reward types but also displayed additional increased DA levels for the larger-value reward. As such, this Late-Phase component in the shell likely reflects an evaluative aspect of the food following its unexpected delivery.

Putative DA neurons in the midbrain were recently shown to encode a similar biphasic signal to unsignaled rewards, with rapid early responses to rewards that were identical regardless of reward value and a slower evaluative phase that differentially encoded the utility value of the reward (Stauffer et al, 2014). Thus there is a remarkable resemblance (though on a faster timescale) between recorded midbrain DA neurons and phasic DA release in the NAc shell, but not NAc core. This provides further evidence that while DAergic neural activity and subsequent DA release often correlate, there are likely additional neuroanatomical or modulatory components of these pathways that can produce highly distinct DA release patterns between terminal fields (Cacciapaglia et al, 2012; Saddoris et al, 2015a; Saddoris et al, 2013).

Cocaine experience substantially altered DAergic reward encoding. In both core and shell, peak phasic Early-Phase DA release was decreased relative to Controls, though, as in Controls, it failed to distinguish reward size. Surprisingly, core Late-Phase DA signals in Cocaine subjects showed an enhanced ability to distinguish reward magnitudes, while shell Late-Phase DA was essentially abolished. In a previous report, we showed a similar pattern of encoding during a Pavlovian task in which core DA was enhanced during signaled rewards, while associative encoding was abolished in the shell (Saddoris et al, 2016). One explanation for this could be a cocaine-mediated shift of mesolimbic DA system signals along a dorsolateral gradient, such that signals previously carried by the shell are now represented in the core following chronic self-administration. Previous work has indicated that cocaine experience can drive encoding from the NAc into the dorsal striatum (Takahashi et al, 2007; Willuhn et al, 2014), while here this shift may be similarly represented within the NAc itself. It should be further noted that we did not see differences in electrically stimulated DA levels between groups, suggesting deficits linked to appropriate coding rather than loss of DA function generally.

Our initial recordings in this experiment were taken in the core, and we were surprised that DA signals in that region failed to encode magnitude, particularly as an earlier study using a similar approach obtained a different result (Nasrallah et al, 2011). We thus used the Free Choice task to assess whether these animals were able to attend to small differences in reward magnitude despite a lack of discriminative encoding. Indeed, Controls not only preferred the 2-Pellet option but chose the larger option at a ratio that was nearly identical to the relative reward values, consistent with Matching Law principles (Mackintosh, 1974). In contrast, Cocaine animals chose the levers essentially equally, despite evidence that DA in the core retained the ability to encode relative reward value.
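The matching prediction referred to above can be made explicit with a short calculation. The sketch below assumes strict matching, in which the proportion of choices for each option equals its share of total reinforcement, and treats pellet count as the only relevant reinforcement variable; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def matching_prediction(magnitudes):
    """Predicted choice proportions under strict matching.

    magnitudes: dict mapping each lever to its pellets per press.
    Assumes choice share equals relative reward magnitude.
    """
    total = sum(magnitudes.values())
    return {lever: value / total for lever, value in magnitudes.items()}

predicted = matching_prediction({"1-Pellet": 1, "2-Pellet": 2})
for lever, share in predicted.items():
    print(lever, round(100 * share, 1), "% of choices")
```

Under these assumptions, the predicted split is one third of choices for the 1-Pellet lever and two thirds for the 2-Pellet lever, which is close to the 33% and 66% observed in Controls.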

In conclusion, these findings suggest that shell DA may be particularly important for magnitude-based encoding and that somewhat ‘preserved’ DA encoding in the core following cocaine experience is ineffective at supporting value-based representations for action. Future investigations will explore the functional relevance of this altered DA signal following abstinence from cocaine self-administration.

FUNDING AND DISCLOSURE

This work was supported by National Institute on Drug Abuse grant DA035322 and University of Colorado startup funds to MPS and DA034021 to RMC. Cocaine used in these experiments was generously provided by the NIDA Drug Supply Program. The authors declare no conflict of interest.