The Contribution of the Amygdala to Reward-Related Learning and Extinction

There has been substantial research into the role of the amygdala in fear conditioning and extinction of conditioned fear. The role of the amygdala in appetitive conditioning is relatively less explored. Here, we will review research into the role of the amygdala in reward-related learning. Research to date suggests that the basolateral and central amygdala are responsible for learning about distinct aspects of a reinforcing event. For example, the basolateral amygdala is essential for distinguishing and choosing between specific rewards based on the specific sensory properties of those rewards, as well as for updating the relative value of specific rewarding events. In contrast, the central amygdala is involved in encoding reinforcement more generally and in regulating motivational influences on responding. We will also review what is known about the role of the amygdala in extinction of reward-related behaviours and highlight areas for future research.


Introduction
While less is known about the role of the amygdala in reward-related learning than about its role in fear conditioning, where detailed circuitry has been mapped out, research to date nonetheless points to an important function for the amygdala and distinct roles for sub-nuclei of this structure. In this review chapter, we will focus on rodent studies (using rats and mice) examining the role of the basolateral (BLA) or central (CeA) amygdala in the formation and expression of both Pavlovian and instrumental associations, in the effects of changes in reward magnitude or value on responding, and in responses to changes in reward contingencies, in order to define the role of the amygdala and its subnuclei in learning about reward and control of reward-seeking behaviours.

Basolateral nucleus of the amygdala

Involvement of BLA in encoding reward expectation in Pavlovian tasks
Neuronal firing in the BLA is elevated in response to reward-predictive CSs, occurring prior to reward delivery. Such activity is thought to drive reward-seeking behaviours. For example, in an odour discrimination paradigm, BLA firing differs following presentation of stimuli that predict a positive outcome (sucrose) versus a negative outcome (quinine), suggesting the BLA is involved in learning about expectancy for consequences of a response [1]. Importantly, this discriminative neural activity precedes reliable behavioural discrimination, suggesting that the neural activity may support learning and behavioural change, which is consistent with a role for the BLA in encoding information about a reward-predictive CS. Data regarding a causal role for this activity in Pavlovian learning are, however, more mixed.
A number of studies have found no evidence of any effect of BLA lesions on the acquisition of a Pavlovian response, with lesioned rats demonstrating food cup approach in the presence of a CS+ similar to that of sham rats ([2,3,4], see also [5]). The acquisition of Pavlovian autoshaping responses was also similar to that of controls [6]. However, other studies indicate BLA lesions or inactivation impair acquisition of Pavlovian associations. Lesions of the BLA reduce rats' preference for a flavoured solution paired with sucrose compared to sham controls, with no effect on consumption of CS+ and CS− solutions when these were not paired with sucrose, suggesting involvement of the BLA in Pavlovian associations between the flavour of the liquid and the sucrose reward [7]. BLA lesions impair taste-potentiated odour aversion, and infusions of the GABA-A agonist muscimol into the BLA indicate the BLA mediates the acquisition, but not the expression, of taste-potentiated odour aversion ([8], see also Ref. [9]). Also, lesions of the BLA impair the acquisition, but not expression, of magazine approach in a task where discrete cues signal the location of a sucrose reward [10]. BLA lesions impair second-order conditioning [3,11,12], but this deficit is secondary to the role of the BLA in the assignment of motivational value to the first-order CS+, without which motivational significance cannot then be transferred to the second-order CS+ to produce conditioned responding [12]. Inactivation of the BLA with the NMDA antagonists AP-5 or d,l-2-amino-5-phosphonovalerate (D-APV) impairs acquisition, but not expression, of Pavlovian conditioned approach for sucrose or taste-potentiated odour aversion [13,14], while inhibition of dopamine D1 receptor activity with SCH-23390 impairs acquisition of Pavlovian discriminative stimulus responding (i.e. approaching the food cup in the presence of a CS+, but not a CS−) [15].
D1 antagonism has no effect on responding when animals are trained further and tested drug free, suggesting a specific role for BLA D1 receptors in the performance of Pavlovian discriminative stimulus approach [15]. It appears acquisition of Pavlovian associations relies on a BLA to nucleus accumbens (NAcc) pathway, as optogenetic inactivation of the BLA to NAcc pathway using halorhodopsin impairs acquisition of licking behaviour for sucrose in response to sucrose-predictive cues ([16], see also Ref. [17] for related function of this pathway). When optical stimulation was removed, licking returned to non-stimulation levels, suggesting no long-lasting effects of BLA to NAcc inhibition on subsequent task acquisition [16]. Thus, despite some inconsistencies in the literature, it appears that the BLA is required for acquisition, but not for expression, of Pavlovian stimulus-reward associations.
There is strong evidence implicating the BLA in the acquisition and expression of conditioned place preference (CPP) for a food reward. Electrolytic and neurotoxic lesions of the lateral amygdala (LA) impair acquisition of CPP for a food reward [18], while BLA lesions or inactivation performed after acquisition of CPP for a food reward impair expression of CPP [19,20], suggesting that the BLA is implicated in both the acquisition and expression of Pavlovian place learning. Furthermore, muscarinic receptors in the BLA are required for the consolidation of food CPP, as intra-BLA scopolamine infusions following conditioning sessions impair consolidation of food CPP [21]. Disconnection of the BLA from the NAcc also impairs expression of sucrose CPP, implicating this pathway in the expression of context-food associations [19]. One study has demonstrated no effect of BLA lesions on place conditioning [10]; however, in this study, discrete cues were used to signal the presence of reward within a Y-maze, providing an alternative strategy by which the animals could solve the task, and, as discussed above, there are a number of studies demonstrating no effect of BLA lesions on Pavlovian learning using a discrete cue. As such, the majority of research suggests the BLA and its projections to the NAcc are required for the acquisition and expression of CPP.

Involvement of BLA in instrumental learning
The involvement of the BLA in the acquisition and expression of instrumental appetitive learning has been extensively examined, with somewhat mixed results. For example, infusion of the NMDA antagonist AP-5 [22] or the D1 antagonist SCH-23390 [23] into the BLA prior to training has been reported to impair acquisition of lever pressing for sucrose pellets, but once learning has occurred, BLA inactivation via AP-5 or SCH-23390 has no effect on the expression of action-outcome contingencies, suggesting involvement of the BLA in task acquisition, but not expression [22,23]. It is important to note that in these studies, performance of the lever-press response produced not only primary reward but also a range of visual cues, e.g. offset of the houselight and onset of a stimulus light followed by sucrose at a 3-second delay. The presence of these stimuli, as well as the delay in reward delivery, makes it unclear whether the BLA is involved in acquisition of the instrumental response-outcome contingency or in these other aspects of the task. Indeed, other studies using a purer instrumental design, where the instrumental response produces reward without the presence of any stimuli or secondary reinforcers, report no effect of BLA lesions on acquisition of instrumental responding for a single action-outcome contingency (e.g. lever press delivers food pellets) ([2,24], but see [25]). Furthermore, lesions of the BLA do not impair acquisition of instrumental responding when two action-outcome contingencies earning distinct outcomes are trained (e.g. lever press delivers sucrose solution, chain pull delivers food pellets) [4,26,27].
In more complex discriminative stimulus tasks, where rats are required to initiate the correct action following stimulus presentation, BLA lesions or inactivation using a combination of the GABA-A agonist muscimol and the GABA-B agonist baclofen [28,29], or selective serotonin lesions of the BLA [30], do not impair task acquisition, suggesting that the BLA is not essential for stimulus-response learning despite its role in stimulus-reward learning. In contrast, BLA inactivation with muscimol or baclofen impairs expression of previously trained discriminative stimulus responding, suggesting that the BLA may contribute to task expression in a discriminative stimulus task but that when the BLA is inactivated during acquisition, the rats can solve the task, perhaps by using a different strategy [17].

Involvement of BLA in detecting changes in reward-predictive nature of an action
In accordance with a role for the BLA in predicting reward, the BLA is also involved in detecting changes in the relationship between an action and a rewarding outcome. This has been demonstrated through contingency degradation paradigms, in which the association between an action and its expected outcome is weakened through non-contingent presentations of the reward. Lesions of the BLA performed prior to behavioural training impair contingency degradation, both under conditions of extinction (when there is no opportunity to update action-outcome contingencies) and under conditions of partial reinforcement [26]. BLA lesions also impair contingency degradation when lesions are performed after acquisition of action-outcome contingencies, ruling out any potential learning impairment which could impact the contingency degradation [31]. These studies suggest BLA involvement in detecting changes in the association between an action and expected reward.

Involvement of BLA in updating the value of an outcome
The BLA appears critical for generating internal representations of a reinforcer to guide choice. For example, BLA neurons are sensitive to reward magnitude. Rats in an eight-arm radial maze exhibit differential BLA firing to rewards of high and low magnitude [32]. Also, BLA activity is altered in response to changes in expected reward magnitude (i.e. upshift or downshift in reward magnitude) [33]. However, once this contingency is learnt, BLA activity decreases, suggesting involvement of the BLA in the acquisition, but not expression, of reward magnitude changes [33]. Noradrenaline (NOR) is released in the BLA following an increase in the number of sucrose pellets delivered in an instrumental task, suggesting NOR in the BLA contributes to signalling changes in reward value [34].
In addition to changes in magnitude, the value of rewarding outcomes can change based on changes in the animal's motivational state. Involvement of the BLA in updating the value of an outcome is demonstrated through incentive-learning tasks. In these tasks, the value of a reinforcer is updated based on changes in internal states (e.g. hunger, satiety) and this information is used to control goal-directed responding. In an incentive learning paradigm, the experience of food consumption in a deprived state subsequently drives responding for that food in a future state of deprivation [35,36]. Importantly, without the opportunity for direct contact with the reward in a novel motivational state, instrumental responding for that reward is not altered even when animals experience motivational state changes at test, e.g., responding is not increased despite an increase in hunger until the animal experiences that food while hungry. Consolidation and reconsolidation of this form of learning are blocked by intra-BLA infusions of the protein synthesis inhibitor anisomycin [37]. Infusions of the non-selective opioid antagonist naloxone into the BLA also impair acquisition, but not retrieval, of incentive learning [36]. This effect appears dependent on μ-opioid receptors, as infusion of a μ-receptor antagonist blocks acquisition of positive incentive learning (i.e. under conditions which enhance reinforcer value), and infusion of a μ-receptor agonist impairs negative incentive learning (i.e. under conditions which reduce reinforcer value) [38]. κ and δ antagonists have no effect on positive or negative incentive learning, suggesting a specific role for μ-receptors in the BLA in incentive motivational processes [38].
Related results have been found in outcome devaluation studies. In this task, hungry rats are, for example, trained to perform two distinct instrumental responses, each earning a unique food outcome. Rats are then pre-fed one of these outcomes to satiety prior to a choice test where the two responses are available but no outcomes are delivered. BLA lesions or inactivation impair sensitivity to outcome devaluation [4,6,26,31,39,40]. This effect is observed both when lesions are conducted prior to instrumental learning and when they are conducted after acquisition of instrumental learning [26,31], indicating that any potential effects of lesions on action-outcome learning do not contribute to the loss of sensitivity to outcome devaluation. Also, BLA inactivation prior to the pre-feeding treatment, but not after pre-feeding and before the lever test, impairs outcome devaluation, suggesting the BLA updates reinforcer value to guide choice, but that once reinforcer value has been updated, the BLA is no longer needed ([41], see also Ref. [6]). Such results have been taken as evidence that the BLA associates the specific sensory features of stimuli with motivational significance and updates this association as needed. This information can then be used to guide choice in outcome devaluation and related paradigms [26].
There are some reports of BLA lesions having no effect on outcome devaluation; however, it appears these results are due to differences in experimental parameters. When rats are trained in a Pavlovian magazine approach paradigm with a single reinforcer and devalued using lithium chloride (LiCl), BLA lesions do not impair outcome devaluation [3,42]. There are a number of variables which could contribute to this apparent discrepancy, namely the method of training (Pavlovian or instrumental), the number of reinforcers used (one or two reinforcers) and the method of devaluation (LiCl or specific satiety). Johnson and colleagues [39] assessed the contribution of the method of training and method of devaluation to establish how these factors may help understand what aspect of learning requires BLA involvement. All rats were lesioned following training, to isolate the effect of the lesion to devaluation. All rats were trained with two reinforcers. Rats were given either Pavlovian or instrumental training, and devalued via either LiCl or specific satiety, creating the following four groups: Pavlovian-LiCl, Pavlovian-specific satiety, instrumental-LiCl and instrumental-specific satiety. BLA lesions impaired outcome devaluation in all four groups. As the only variable which was not manipulated was the number of reinforcers trained (in this case, two), this led the authors to suggest the number of reinforcers used may mediate whether the BLA is required for outcome devaluation. Indeed, if the BLA encodes sensory representations of a stimulus and associations of these with value, then successful devaluation performance may depend on the ability to generate sufficiently detailed outcome representations so that performance related to the currently devalued outcome, but not other possible outcomes, is specifically affected.
Thus, in the case of two reinforcers, the BLA is required to generate this specific representation so that animals can respond to the two reinforcers based on their specific sensory features in order to guide action selection, as the motivational significance of the two is largely overlapping. However, when only one reinforcer is present, no discrimination between reinforcers nor any association of value with the sensory features of the outcome is required, which may leave outcome devaluation intact when the BLA is offline ([39], for discussion, see Refs. [31,43]).

The BLA, reward prediction and stimulus influences on responding
Despite some inconsistencies in the effects of BLA lesions or inactivation on the acquisition of instrumental responding, incentive learning and outcome devaluation tasks suggest that the BLA is important for assigning motivational significance to outcomes based on their specific sensory features. There is strong evidence to suggest that the BLA is important for assigning reward value to actions and external stimuli more generally. Early reports demonstrate that in an eight-arm radial maze, BLA neurons exhibit enhanced firing to an anticipated reward encounter, implicating BLA activity in predicting rewarding events [32]. In operant tasks requiring rats to nose poke for sucrose, BLA firing is enhanced during reward expectation, but decreases when animals no longer anticipate reward delivery following their actions under behavioural extinction [44]. Also, LA neurons respond to reward-predictive cues, and activity in LA neurons is associated with task efficiency and accuracy, as well as increased synaptic strength [45]. Finally, neurons activated in response to a discriminative stimulus fire in the BLA prior to the NAcc, suggesting the BLA drives NAcc neuronal responses to reward-predictive cues to promote reward-seeking behaviour [17].
Following detection of reward-predictive stimuli, it appears glutamate transients in the BLA are involved in initiating reward-seeking action. Glutamate transients in the BLA are enhanced during a seeking-taking chain task for sucrose pellets, and glutamate transients tend to precede lever responses on both the distal lever (i.e. lever responses which give access to the proximal lever) and the proximal lever (i.e. lever responses which are rewarded with sucrose) [46]. Furthermore, in a simple instrumental task, BLA glutamate transients are more likely to be associated with initiating the pressing bout than with reward or non-reward earning lever presses [47]. These data suggest glutamate signalling is critical for driving actions which lead to reward.
Experiments using outcome devaluation indicate involvement of the BLA in encoding the sensory-specific properties of reinforcers. Further support for this notion is derived from Pavlovian-instrumental transfer (PIT) experiments, where presentation of a CS previously paired with a reinforcer drives instrumental responding for the same reinforcer, despite the CS and instrumental response having never been trained together before. Importantly, when rats are trained with two reinforcers (i.e. two CSs are paired with two distinct rewards and two instrumental responses earn those same two rewards), responding during PIT can be identified as 'specific', with increased instrumental responding on the lever that, in training, delivered the same outcome as that predicted by the stimulus. In contrast, 'general' PIT is an elevation in responding that does not rely on a common outcome in instrumental and Pavlovian training phases (for a discussion, see [4,48]). When rats are trained with multiple rewards, BLA lesions impair specific, but not general, PIT [4,27], suggesting the BLA is required for distinct stimuli to direct instrumental responding. Furthermore, blocking AMPA, but not NMDA, receptors in the BLA inhibits PIT [47], and BLA glutamate transient frequency correlates with instrumental responding during the CS which was trained with the same outcome, but not the CS trained with the different outcome [47]. Importantly, BLA glutamate transient frequency is enhanced during initiation of lever pressing, suggesting BLA engagement following detection of reward-predictive stimuli which initiates goal-directed responding [47]. It appears the involvement of the BLA in PIT is dependent on the number of reinforcers (stimuli and responses) trained, because when rats are trained with only one reinforcer, BLA lesions have no effect on PIT [2,49].
When two reinforcers are trained, specific sensory properties need to be utilised to permit discrimination; this requires the BLA and is impaired following BLA lesions or inhibition. However, when only one reinforcer is trained, it is not necessary to distinguish between reinforcers via their sensory properties to direct responding; thus, this behaviour does not require the BLA and is therefore unimpaired following BLA lesions.
Consistent with PIT studies, BLA lesions abolish outcome-guided responding in an outcome-specific reinstatement task, in which outcome presentation selectively increases performance of a response previously associated with the same outcome as, but not a different outcome from, that which was just presented [31], supporting the involvement of the BLA in the representation of sensory-specific properties of stimuli and integration of those stimulus properties with motivational significance to direct choice behaviour.

The BLA mediates risky and effortful decision making
Experiments using effort discounting paradigms indicate the BLA guides choice towards high effort/high reward options. When rats are required to choose between high effort/high reward versus low effort/low reward options in a T-maze, BLA lesions reduce choice for the high effort/high reward option [50,51]. Similarly, when a high reward choice requires a longer delay, as in delay discounting paradigms, BLA lesions also bias choice to smaller, immediate reward [52,53]. There can be some recovery from bias towards low effort/low reward options, suggesting the BLA is involved in the acquisition of the value of a reward in an effortful task [51]. Disconnection of the BLA from the medial prefrontal cortex or anterior cingulate cortex (ACC) also biases choice to smaller, immediate rewards ([50,52], but see Ref. [54]), consistent with a role for these structures in effort-based decision making [55].
Paradigms involving risky decision making indicate the BLA also guides choice towards high risk/high reward options. BLA inactivation with baclofen or muscimol induces a risk-averse pattern of choice and, in a similar paradigm, reduces high effort choice, irrespective of the delay to reward [56]. It is possible that the BLA directs responding toward risk when loss is involved, as rats with BLA lesions bias their behaviour away from risk when loss is a consequence of a high risk choice, but do not alter their behaviour when potential gains are available [57]. BLA lesions or inactivation do not alter choice when two rewards are equal, or when there is no risk involved [56,57,58], suggesting a particular role for the BLA in biasing choice in the face of aversive consequences. It appears a BLA-NAcc connection mediates BLA-induced biasing of choice, as contralateral lesions of these structures bias choice toward a less risky option [54]. Some studies demonstrate that BLA lesions enhance, rather than decrease, risky decision making in a rodent gambling task [59], or when foot shock is used as punishment (instead of reward omission) [58]. However, a recent study indicates individual differences may help explain these results. BLA inactivation can affect animals differently at an individual level: BLA inactivation increases effortful choice in rats which, at baseline, chose low effort/low reward options, and decreases effortful choice in rats which, at baseline, chose high effort/high reward options [60]. Furthermore, this biasing of choice appears dependent on BLA dopamine receptors. In risk-averse rats, D1 agonist infusions into the BLA increase risky choice, whereas in risk-prone rats, enhancing D1 activity reduces risky choice [61]. Also, infusions of the D2 agonist quinpirole reduce risky choice in risk-prone rats [61].
It is possible dopamine receptors in the BLA mediate the interaction between costs and benefits in a task to generate subjective value, which could differ between individuals or across experiments. Approaching the BLA as a mediator of decision making based on a cost/benefit analysis may explain why some studies report an increase in risky decision making following BLA inactivation: the effects of inactivation of this structure on behaviour may be dependent on task parameters which can bias decision making in a certain direction.

No consistent involvement of the BLA in reversal learning
Several studies have examined the role of the BLA in reversal learning; however, the results at present are inconsistent. One study demonstrates that inactivation of the BLA with muscimol impairs reversal learning in an odour discrimination task [52]; however, another study demonstrates no effect of BLA lesions on reversal learning in a go/no-go odour task [62]. Interestingly, in this study BLA lesions ameliorated impairments in reversal learning induced by orbitofrontal cortex lesions, suggesting projections between these two regions may control reversal learning [62]. In an operant nose-poking discriminative stimulus task, BLA lesions have been shown to facilitate reversal learning, and limit the number of mistakes made following feedback on an incorrect trial [63]. However, in a similar nose-poking discriminative stimulus paradigm, serotonin depletion in the BLA had no effect on reversal learning [30]. It appears the inconsistency is not explained by the task employed, as studies using similar tasks (e.g. odour discrimination, operant nose poking) report inconsistent effects of BLA inactivation on reversal learning. Further research in this field is required to more conclusively determine the role of the BLA in reversal learning.

Involvement of the BLA in appetitive extinction learning
A number of studies demonstrate that the BLA is critical for the acquisition of appetitive extinction learning. Excitotoxic lesions of the BLA enhance resistance to extinction when a magazine light and sucrose reinforcer are omitted, indicating BLA lesions impair extinction learning [28]. However, in this study, the use of excitotoxic lesions did not permit analysis of whether BLA lesions impair encoding, consolidation or retrieval of extinction learning [28]. Inactivation of the caudal BLA with bupivacaine (a sodium channel blocker) impairs acquisition, but not retrieval, of extinction of instrumental responding, demonstrating BLA involvement in the acquisition of extinction learning [64]. In apparent contrast, intra-BLA infusion of the NMDA partial agonist DCS, which should increase rather than decrease activation of BLA neurons, prior to extinction learning in an odour discrimination task has been reported to impair extinction and enhance responding at a retention session [65]. While a number of studies demonstrate DCS-enhanced extinction learning (reviewed in Refs. [66,67]), it appears the timing of DCS administration is critical in determining whether it enhances or impedes extinction learning (for a discussion, see Ref. [65]). Nonetheless, this study [65] demonstrates that NMDA receptors in the BLA contribute to extinction of appetitive learning. Finally, a subset of BLA neurons responds specifically during extinction of operant nose poking for sucrose; this subset does not respond during task acquisition, and activity of these neurons is inversely correlated with responding during extinction [68]. These studies demonstrate a critical role for the BLA in detecting the absence of an expected reinforcer during instrumental appetitive extinction and are in agreement with the role of the BLA in detecting changes in reward value.
There is also some evidence to support a role for BLA signalling in the extinction of Pavlovian appetitive learning. For example, when rats are trained to lick for sucrose in the presence of a combined tone/light CS+, the firing of BLA neurons during extinction correlates strongly with extinction behaviour [69]. Furthermore, a subset of BLA neurons which responded during extinction also respond during reinstatement, suggesting the BLA is a site of plasticity mediating responding for motivationally significant stimuli [69]. These studies suggest that the BLA mediates aspects of Pavlovian appetitive extinction; however, further research in this field is required to determine the precise role of the BLA in appetitive extinction.

The BLA as part of a broader circuit involved in reward-related learning
It is important to recognise that the BLA does not operate in isolation to control learning and performance. Below we highlight some examples of interactions of the BLA with other structures. This section is not meant to be comprehensive but to provide examples of how the BLA interacts with other brain areas. The BLA has dense projections to the posterior dorsomedial striatum (pDMS), insular cortex (IC) and NAcc [70], which, following detection of a change in reinforcer value in the BLA, mediate aspects of goal-directed responding, such as knowledge of and engagement in action-outcome contingencies. A BLA-IC connection is required to encode and retrieve changes in reinforcer value [71]. BLA inactivation using the NR2B NMDA antagonist ifenprodil prior to specific satiety, but not prior to a choice test, impairs outcome devaluation, suggesting BLA involvement in encoding changes in reinforcer value. However, ifenprodil infusions into the IC prior to specific satiety or a choice test impair devaluation, suggesting that the IC mediates expression of devaluation. Finally, ifenprodil infused unilaterally into the BLA prior to specific satiety and into the IC prior to a choice test blocks expression of devaluation, but ifenprodil infused into the IC prior to specific satiety and then into the BLA prior to choice has no effect on expression of devaluation. This suggests the BLA updates and encodes information about reinforcer value during specific satiety, sending information to the IC prior to choice, and at test, the IC retrieves this information to guide choice between actions [71]. Also, connections between the BLA and the pDMS are required to direct action-outcome responding following a change in reinforcer value. The pDMS is critical for updating action-outcome contingencies, as changes in response-outcome associations are impaired following pDMS lesions [72,73].
It appears the pDMS is required to retrieve action-outcome associations following a change in reinforcer value, as unilateral lesions of the BLA coupled with inactivation of the contralateral pDMS prior to a choice test impair expression of outcome devaluation [73]. This suggests information from the BLA regarding the specific value of outcomes is transferred to the pDMS, which retrieves action-outcome associations to guide instrumental performance [73].
Finally, a connection between the BLA and NAcc shell is necessary for action selection following reinforcer devaluation. Disconnection of the BLA and NAcc via contralateral excitotoxic lesions impairs outcome devaluation, without reducing overall responding [74]. It is possible the BLA conveys sensory-specific outcome information and/or changes in reinforcer value to the NAcc, where it is used to direct outcome-appropriate instrumental responding [74]. Previous reports demonstrate that NAcc shell lesions impair the ability of action-outcome cues to bias action selection in Pavlovian-instrumental transfer [75], supporting the view of the NAcc shell as a limbic-motor interface structure [76]. Thus, it appears sensory-specific information from the BLA is used to drive action selection in the NAcc shell, which can direct actions through motor output structures such as the ventral pallidum and mediodorsal thalamus.

The CeA in Pavlovian learning
The CeA is involved in conditioning with both appetitive and aversive reinforcement [77], and one proposed role for the CeA is determining the valence of reinforcing events. For example, c-fos immunoreactivity is increased in the medial CeA following exposure to a CS+ signalling food delivery, compared to a CS which did not signal food delivery [78]. CeA c-fos immunoreactivity is also increased following exposure to a CS+ signalling foot shock, particularly in ventral regions of the structure, suggesting sub-regions of the CeA may detect the valence of a CS [78]. We focus here on data related to appetitive learning.
When a visual or auditory CS is paired with food, rodents can acquire distinctive behaviours to CS presentation; they may orient themselves to the CS, either by approaching or rearing to a light or startling in response to a tone (orienting responses), or they may approach the site of food delivery, usually a food cup or magazine (conditioned approach). There is considerable evidence to support a role for the CeA in conditioned orienting responses to a Pavlovian CS+, but not for conditioned approach. Additionally, CeA lesions do not impair second-order Pavlovian conditioned approach [3]. Lesions of the CeA prior to training impair the acquisition of conditioned orienting responses (e.g. rearing to a light), but leave conditioned approach intact [79-81].
Similarly, inactivation of the CeA with the AMPA antagonist NBQX impairs acquisition of orienting responses [80]. CeA lesions or inactivation after Pavlovian training have no effect on the expression of Pavlovian orienting responses or food cup approach, suggesting a role for the CeA in the acquisition, rather than the expression, of orienting responses [80]. While some studies report no effect of CeA lesions on Pavlovian learning, these studies have only assessed conditioned approach behaviour ([2,82]; see also [7]), supporting a dissociation between conditioned approach and conditioned orienting responses in the CeA.
The CeA may be involved in conditioned approach behaviour when rats are trained to approach the magazine following the presentation of a CS+, but not a CS−, and a discrimination score is created which depicts their approach following one CS presentation over another CS. In Pavlovian approach paradigms, each CS+ is associated with a reinforcer, but not with the absence of reinforcement. CeA lesions or intra-CeA inactivation of D1 or D3 receptors reduce conditioned approach behaviour [15,83,84]. If the CeA is involved in discriminating positive or negative reward value (discussed below), CeA inactivation may impair this discrimination, leading to a lower discrimination score. Supporting this interpretation, Andrzejewski and colleagues reported equal nose poking rates between the CS+ and CS−, rather than an abolition of nose poking [15], which would support a lack of discrimination between the two CSs but not an inhibition of nose poking following CeA inactivation. It is also possible that these effects relate specifically to dopamine function within the CeA.
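The difference between a lowered discrimination score and abolished responding can be made concrete with a small sketch. The ratio used here (CS+ responses divided by total responses) is one common way of expressing discrimination and is an assumption for illustration; it is not necessarily the measure used in [15], and the response counts are hypothetical.

```python
def discrimination_ratio(cs_plus: float, cs_minus: float) -> float:
    """Proportion of all responses that occur during the CS+.

    Assumed formula for illustration only: 0.5 means no discrimination,
    values approaching 1.0 mean responding is selective to the CS+.
    """
    total = cs_plus + cs_minus
    if total == 0:
        # With no responding at all the ratio is undefined, which is why
        # total responding must be reported alongside the score.
        raise ValueError("no responses: discrimination ratio is undefined")
    return cs_plus / total


# Hypothetical nose-poke counts (not data from any cited study).
intact = {"cs_plus": 30, "cs_minus": 10}        # selective responding
inactivated = {"cs_plus": 20, "cs_minus": 20}   # equal responding to both CSs

intact_score = discrimination_ratio(**intact)
inactivated_score = discrimination_ratio(**inactivated)

# Total responding is identical in the two hypothetical cases, so the
# lower score reflects lost discrimination, not suppressed nose poking.
intact_total = sum(intact.values())
inactivated_total = sum(inactivated.values())
```

In this sketch the score falls to chance level while total responding is unchanged, which is the pattern that would distinguish a discrimination deficit from a general reduction in approach.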

Circuitry mediating conditioned orienting responses
In rats injected with the retrograde tracer fluorogold into the substantia nigra pars compacta (SNc), there was a greater number of c-fos positive/fluorogold positive cells in the CeA following food-tone pairings than following unpaired food and tone presentations, implicating this pathway in conditioning [85]. Furthermore, contralateral lesions to disconnect the CeA and SNc impair orienting responses but not food cup approach, compared to an ipsilateral lesion control group [85]. Considering that the CeA has a substantial projection to the SNc, which provides dopaminergic innervation to the dorsolateral striatum (DLS) [86,87], it is possible that a CeA-SNc-DLS pathway mediates orienting responses to Pavlovian food CSs. Evidence for this comes from the demonstration that unilateral lesions of the CeA coupled with dopamine depletion in the DLS in the opposing hemisphere impair conditioned orienting responses for food pellets, while leaving food cup approach behaviour intact [88]. Similar results were obtained when the DLS was reversibly inactivated with lidocaine [88]. Recovery of conditioned orienting responses occurred on drug-free days in rats previously treated with intra-DLS lidocaine, suggesting no long-lasting effects of CeA-DLS inactivation on acquisition of conditioned orienting [88]. Together, these results suggest orienting responses to a Pavlovian cue are mediated by indirect connections between the CeA and the DLS, likely via the SNc.

The CeA in instrumental learning
The CeA does not appear to be critical for the acquisition of instrumental action-outcome contingencies. Lesions of the CeA do not impair instrumental learning when there is a single action-outcome contingency (i.e. one lever, one reinforcer) [2,82] or two action-outcome contingencies (i.e. two levers, two reinforcers) [4]. There is some evidence that the CeA is involved in updating action-outcome contingencies. The omission of an expected reward at test enhances c-fos immunoreactivity in the CeA, suggesting the CeA detects the lack of reward [89]. Furthermore, fluorogold injections in the SNc demonstrate these c-fos positive CeA cells project to the SNc [89], implicating a CeA-SNc pathway in the detection of changes in reward contingencies. Lesions of the CeA produce a mild impairment in performance when an expected reward of small magnitude is omitted [90]; however, lesions of the entire CeA and BLA combined substantially reduce sensitivity to omission and so the specific contribution of the CeA is somewhat unclear [91].
CeA involvement in the detection of changes in reward value appears to depend on the paradigm used to assess this change. In studies using the outcome devaluation task, where an outcome is devalued either with selective satiety or LiCl-induced sickness, CeA lesions do not impair behavioural sensitivity to changes in reward value [3,82], indicating no role in this evaluative process and further substantiating intact action-outcome learning necessary for performance in this task. Of interest, CeA lesions prevent loss of sensitivity to outcome devaluation that typically occurs with over-training, suggesting a role for the CeA in habitual behaviour [82]. Similar effects are observed following disconnection of the CeA from the DLS, produced by contralateral lesions of these structures, suggesting that the CeA sends a reinforcement signal to the DLS to strengthen the stimulus-response (S-R) association that is thought to underlie habit learning [82].
While the CeA is not necessary for normal sensitivity to devaluation, it is involved in learning about changes in the magnitude of reward. When rats are trained to run in a straight alley maze task for a large food reward, a downward shift in the magnitude of the food reward increases the latency of intact rats to reach the smaller reward [92]. Post-shift lidocaine infusions into the amygdala, which were mostly aimed at the CeA, reduce the latency to reach a smaller reward, suggesting reduced sensitivity to the change in reward magnitude [92]. Similarly, pre-training CeA lesions slow learning about a downward shift in reward magnitude in a straight alley maze, supporting a role for the CeA in detecting reward magnitude changes [93,94]. More recently, optogenetic stimulation of the CeA with channelrhodopsin was shown to enhance lever pressing for a sucrose pellet when both the delivery and consumption of this pellet were paired with laser stimulation, compared to delivery of sucrose pellets alone, suggesting CeA stimulation may enhance the perceived magnitude of a reward [95]. Finally, the μ-opioid agonist DAMGO administered within the CeA enhances sniffing and nibbling at a food cup or reward-predictive lever, suggesting enhanced reward value attributed to these stimuli following CeA μ-opioid stimulation [96]. Collectively, these studies implicate the CeA in processing changes in reward magnitude. Performance may be spared where tasks rely on discrimination and choice between rewards based on their relative value and distinguished using specific sensory properties. Such tasks rely instead on the BLA, as described above.
Together these findings are consistent with the idea that whereas the BLA is responsible for assigning and updating the value of specific outcomes based on their sensory properties, the CeA is responsible for a less specific reinforcement signal, accounting for its role in both habit learning and adjusting performance following changes in reward magnitude [43,82].

CeA involvement in stimulus influences on instrumental responding
The CeA plays a role in signalling the general motivational information carried by stimuli but, consistent with the studies above, not in the detailed representation of the specific features of distinct rewards. Evidence for this comes from Pavlovian-instrumental transfer (PIT) tasks, which assess control of instrumental responding by Pavlovian cues, despite the two types of training being conducted separately. PIT occurs when presentations of a CS (previously paired with a US) drive instrumental responding which was previously trained to obtain the same US. The involvement of the CeA in PIT is dependent on the type of PIT being examined. When rats are trained on one instrumental action-outcome contingency and undergo Pavlovian training which involves one CS-US association, CeA lesions impair PIT [2,48,82], while intra-CeA infusion of the μ-opioid agonist DAMGO enhances PIT [97]. However, when rats are trained on two instrumental action-outcome contingencies and two Pavlovian CS-US pairings, lesions of the CeA have no effect on the outcome-specific PIT that is generated by this type of training [4]. Importantly, in an experimental design where a third excitatory CS+ is introduced in the Pavlovian training phase and paired with a third reward not earned by either instrumental response, CeA lesions impair responding to the third CS, but leave outcome-specific PIT intact, suggesting the CeA is involved in general appetitive arousal rather than directing outcome-specific responding [4]. This suggestion also accounts for the experiments described above which use only one action-outcome contingency and one CS-US pairing; when there is no choice between CS-driven responses, a reduction in general appetitive motivation reduces responding in general. These findings suggest the CeA encodes a reinforcement signal which is devoid of specific details about an outcome.

Some involvement of the CeA in extinction of appetitive learning
Several electrophysiological studies implicate the CeA in the extinction of appetitive learning. For example, Toyomitsu et al. [69] recorded from neurons in the BLA, LA and CeA during extinction of Pavlovian licking for sucrose reward. CeA firing during extinction correlated with extinction of licking; however, this correlation was not as strong as that for BLA firing [69]. Furthermore, while there were changes in the firing rate of CeA neurons between extinction and reinstatement, these were not as pronounced as in BLA neurons [69]. Calu et al. [98] recorded from the CeA during an over-expectation task where initially multiple stimuli are trained as independent predictors of reward (e.g. individual stimuli such as a tone, light, etc., each predict a food pellet). The critical manipulation comes when two or more of these stimuli are then presented together, as a compound. Animals typically increase responding to such a compound, indicating that they expect more reward based on the multiple predictors (e.g. since tone and light alone previously predicted one pellet, the two stimuli together should predict two pellets). However, if this compound is followed only by the original reward (one pellet), behavioural responding decreases across trials, as does activity of the CeA [98]. Together with data from extinction paradigms, this suggests that the CeA may signal reward reduction in general, not just reward omission. A recent study by Iordanova et al. [99] examined this possibility by exploring the role of the CeA in updating reward expectancies following a reduction in reward achieved either through extinction, where reward was omitted entirely, or through generating over-expectation where, due to the presence of multiple predictors, a large reward is expected but not received.
In both paradigms, the majority of recorded cells showed an increase in firing during the period where food reward was delivered and also during the preceding stimulus presentations, reflecting reward expectancy. Neural firing to the extinction stimulus was reduced across trials compared to a control stimulus. When a combination of previously rewarded stimuli was introduced to generate over-expectation, neural firing to this compound was increased relative to a control compound in early trials but then equivalent in later trials, presumably as animals came to expect the reduced reward that was received. A subpopulation of the reward-responsive cells showed a reduction in firing to both the extinction and over-expectation trials, suggesting a common role in signalling reduced reward expectation. Importantly, this change in neural activity preceded and predicted the decline in behavioural responding observed under both extinction and over-expectation conditions. Because these conditions involved the delivery of different amounts of reward (no reward in extinction whereas reward was still delivered in over-expectation, albeit less than initially expected based on the stimuli), the similar changes in neural activity are unlikely to reflect absolute reward magnitude, but rather may signal the reduction in reward expectancy. It is possible that this reduction in reward creates an aversive motivational state, in which case these findings could be consistent with a more general role for the CeA in emotional learning [99].
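The shared logic of extinction and over-expectation can be sketched with the Rescorla-Wagner learning rule, a standard associative learning model. It is used here purely as an illustration and is not claimed to be the analysis applied in [98] or [99]; the learning rate, trial numbers and reward magnitudes are all assumed values. In this model, the prediction error on each trial is the delivered reward minus the summed prediction of all cues present, so both procedures begin with a negative error even though reward is still delivered in over-expectation.

```python
def rescorla_wagner(strengths, trials, learning_rate=0.2):
    """Apply the Rescorla-Wagner rule over a sequence of trials.

    strengths: dict mapping cue name -> associative strength V
    trials: list of (cues_present, reward) tuples
    Returns the updated strengths and the prediction error on each trial.
    """
    v = dict(strengths)  # copy so the caller's dict is not modified
    errors = []
    for cues, reward in trials:
        prediction = sum(v[cue] for cue in cues)  # summed expectancy
        error = reward - prediction               # prediction error
        for cue in cues:
            v[cue] += learning_rate * error       # every present cue shares the error
        errors.append(error)
    return v, errors


# Phase 1 (assumed design): tone and light each separately predict one pellet.
initial = {"tone": 0.0, "light": 0.0}
training = [(("tone",), 1.0), (("light",), 1.0)] * 50
trained, _ = rescorla_wagner(initial, training)

# Over-expectation: the tone-light compound is followed by a single pellet.
over_v, over_errors = rescorla_wagner(trained, [(("tone", "light"), 1.0)] * 50)

# Extinction: the tone alone is followed by no reward.
ext_v, ext_errors = rescorla_wagner(trained, [(("tone",), 0.0)] * 50)

# In both cases the first-trial error is close to -1: the animal receives
# one pellet fewer than the cues jointly predict.
print(round(over_errors[0], 2), round(ext_errors[0], 2))
```

Under these assumptions the two procedures generate the same initial negative error despite different absolute reward, which parallels the argument that the common change in CeA firing tracks reduced expectancy and not reward magnitude. The strengths then settle at different values: each cue falls to about half a pellet in over-expectation and towards zero in extinction.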

Conclusion
While less is known about the role of the amygdala in reward-related learning compared to its role in fear conditioning, where detailed circuitry has been mapped out, research to date nonetheless points to an important function for the amygdala. For example, the basolateral amygdala is involved in associating sensory-specific aspects of different outcomes with the rewarding effects of those outcomes, a function critical for choice between alternatives and behavioural control more generally. Further, the amygdala appears to be involved in updating representations of value either when the value of the outcome is changed, for example, following devaluation, or when the relationship between predictors and outcome delivery is changed, as in extinction. Thus, the amygdala plays an important role in reward-related learning. With the advent of tools such as optogenetics, researchers can now go on to explore how these functions are achieved within the complex circuitry of the amygdala and associated structures.