Attractor Hypothesis of Associative Cortex: Insights from a Biophysically Detailed Network Model

Ever since Hebb proposed that cells that fire together wire together, the idea that memories are formed by distributed cell assemblies capable of self-sustained activity [1] has been one of the main hypotheses regarding memory formation and recall. It has laid the foundation for a theory of attractor memory extensively exploited in computational neuroscience. Memory representations, manifested as the selective activations of these cell assemblies, serve as attractors in simulated neural networks and can be retrieved as a result of external stimulation or intrinsic system dynamics.


Introduction
Despite major efforts in neuroscience to investigate the attractor hypothesis experimentally, which have produced some supporting evidence, no conclusive result that proves or rejects it has been provided. This status can largely be attributed to the limitations in data collection and the distributed nature of Hebbian cell assemblies. For the attractor hypothesis of associative cortex to be validated, simultaneous spiking data from a vast number of cells over a large spatial scale would have to be recorded. In slices [2] and cell cultures [3], which are more accessible for such recordings, evidence for cell assemblies capable of self-sustained activity has been provided. In vivo, however, the task is more challenging since the use of intrusive techniques is limited. In addition, activity related to attractor dynamics can be obscured by spiking contributions reflecting other, parallel processes in behaving animals. In consequence, we must at this point rely on indirect evidence. Simulations in biophysically detailed attractor networks can provide useful insights in this regard and help to address questions relevant to a hypothesis of attractor computations in cortical circuits, for example:
• Is the known cortical connectivity, with relatively sparse cell-to-cell connections, sufficient to support the globally coherent phase transitions and sustained activity states associated with attractor networks?
• Likewise, is the observed sparse and low rate cortical activity consistent with the activation of recurrently connected cell assemblies?
• What features of the neural activity in vivo could be linked to and interpreted in light of the simulated attractor dynamics?
• In what capacity can additional phenomena, such as oscillations in various frequency bands and cross-frequency coupling effects, be explained by the presence of attractors in the biological system?
In addition, models compliant with known biological data can make testable predictions and guide further experimental work. Over the last few decades, attractor networks have been used extensively as models for cortical memory in various paradigms [4-15]. The major distinguishing feature of the model presented here is that it operates in an oscillatory regime and has a modular structure [16-20]. Throughout this chapter we present evidence for the biological relevance of these features and motivate the functional advantages of oscillations in our attractor network.

Basic hypothesis
The ad hoc hypothesis adopted here is that layers 2/3 of associative cortex provide the neural substrate for an attractor memory network. In light of the attractor hypothesis, cortical memory representations correspond to attractor states supported by recurrent excitatory connections.
Attractor networks have several dynamical attractors, each defined by a specific combination of active and inactive units, to which similar activity patterns are drawn. These attractors can be stored by means of synaptic learning. The attractor dynamics lend the memory system several useful features. First of all, such memory networks are noise resistant and fault tolerant in the sense that a noisy, corrupted or incomplete stimulus can still activate the full corresponding memory pattern, an effect known as pattern completion. Furthermore, when conflicting stimuli are provided, the phenomenon of pattern rivalry occurs. In addition, local synaptic learning rules are sufficient to form global memory patterns using highly parallel processing. Despite this locality, an attractor network trained with a Bayesian-Hebbian learning rule [21] retrieves the pattern supported by the stronger evidence, based on the statistics of the input and previous learning examples. Finally, storage capacity in large-scale attractor networks appears to meet biological needs [22].
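Pattern completion and pattern rivalry can be illustrated with a textbook Hopfield-style network. The sketch below is only an abstraction under stated assumptions: it uses binary ±1 units, a plain outer-product Hebbian rule (not the Bayesian-Hebbian rule of [21]), full connectivity, and hypothetical sizes (400 units, 3 random patterns). It is not the biophysical model discussed in this chapter.

```python
import numpy as np

def train_hebbian(patterns):
    """Outer-product (Hebbian) weights for +/-1 patterns, without self-connections."""
    patterns = np.asarray(patterns, dtype=float)
    n_units = patterns.shape[1]
    weights = patterns.T @ patterns / n_units
    np.fill_diagonal(weights, 0.0)
    return weights

def recall(weights, cue, max_sweeps=20):
    """Asynchronous sign updates, repeated until a full sweep changes nothing."""
    state = np.array(cue, dtype=float)
    for _ in range(max_sweeps):
        previous = state.copy()
        for i in range(state.size):
            state[i] = 1.0 if weights[i] @ state >= 0.0 else -1.0
        if np.array_equal(state, previous):
            break
    return state

rng = np.random.default_rng(0)
n_units = 400                                   # hypothetical network size
patterns = rng.choice([-1.0, 1.0], size=(3, n_units))
weights = train_hebbian(patterns)

# Pattern completion: corrupt 25% of the units of pattern 0 and let the network settle.
cue = patterns[0].copy()
cue[rng.choice(n_units, size=100, replace=False)] *= -1.0
completion_overlap = float(recall(weights, cue) @ patterns[0]) / n_units

# Pattern rivalry: a conflicting cue with 70% of units from pattern 0 and 30% from pattern 1.
mixed = np.concatenate([patterns[0][:280], patterns[1][280:]])
winner = recall(weights, mixed)
rivalry_overlaps = [float(winner @ p) / n_units for p in patterns[:2]]

print("completion overlap:", completion_overlap)
print("rivalry overlaps (pattern 0, pattern 1):", rivalry_overlaps)
```

An overlap close to 1 means the network has settled into the corresponding stored pattern. In the rivalry case the state is expected to settle into the pattern with the stronger evidence rather than into a mixture.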
Despite a high degree of compatibility between the functionality of attractor networks and that of cortical memory, it is relevant to study the actual anatomical substrate of attractor dynamics in cortex. As mentioned at the beginning, we hypothetically designate layers 2/3 as the main driver of such dynamics, mostly due to the predominant presence of dense recurrent connections, which are necessary to support attractor function. From a neurodynamical perspective, these layers seem to be the main source of excitatory drive in the cortical circuitry [23,24]. In addition, the phylogenetically oldest parts of the cerebral cortex contain only the superficial layers, so if attractor functionality is central to cortical processing, it should be harbored there. The deeper layers, which emerged later in evolution, could still be directly involved in or support attractor function. In that case, however, they would more likely address needs arising from the expanding cortex size, such as readout and output to subcortical structures [25,26], or participate in the selection and modulation of both task-relevant and task-irrelevant cortical modalities. The latter notion is bolstered by the fact that layer 5 seems critical in the regulation of cortical up and down states [27,28], i.e. in the regulation of the global excitability of entire cortical areas.
In further support of attractor dynamics in the superficial layers, stimulus-evoked neural activity in layers 2/3 is sparser, has a lower average firing rate and is more selective to input statistics than activity in the deeper layers [29,30]. These characteristics are congruent with sparse and distributed memory patterns stored in attractor networks. However, a consequence of this relatively sparse activity is that it is likely to be obscured by deep-layer activity when large quantities of spiking data are collected, which hinders the acquisition of direct neural evidence for attractor-like dynamics. It is not surprising, therefore, that the most direct in vivo evidence of attractor dynamics comes from olfactory cortex ([31,32] and references therein) and hippocampus [12,33,34], i.e. cortical structures that lack the deeper layers. There is also evidence for self-sustained and input-specific activity from inferotemporal [35,36] and prefrontal cortex [37-39], which are late in the processing stream and therefore should be more strongly influenced by the intrinsic connectivity. In addition, two-photon calcium imaging studies have produced relevant insights into the attractor hypothesis, since the imaging method can reveal calcium current traces with good temporal resolution in tens to hundreds of neurons simultaneously within a small cortical volume of the superficial layers in vivo [40-43]. This technique was recently used to demonstrate nonlinear attractor-like activity in auditory cortex [42]. In particular, spatially organized neuronal sub-groups were shown to respond discretely in time to specific auditory cortex input [42]. Here, groups of stimuli evoked all-or-nothing responses in distinct neural sub-groups. These discrete activities were, however, partly obscured by a large trial-to-trial variability.
Finally, there is evidence for attractor dynamics sustained by the recurrent connectivity in striate cortex [44,45]. Using voltage-sensitive dye imaging, Kenet et al. [45] found that the superficial layers switched spontaneously and in a coordinated fashion between re-occurring states spanning several cortical columns. These spontaneous states showed strong correlation with visually evoked patterns of activity and have later also been reported to match the structured, horizontal long-range connections in layer 2/3 [46]. It thus seems likely that visually evoked states are strongly related to self-sustained attractor states supported by recurrent connectivity in the superficial layers.
However, it is not clear whether such switches between stable activity patterns are indeed compatible with the dynamics of computational networks, since such models, unlike biology, often use full connectivity between units. Further, single units in attractor networks display very high firing rates with low variability, while superficial activity in vivo has a low rate and is highly variable. From the modeling perspective, the implications of the questionable assumption of all-to-all cortical connectivity adopted in theoretical studies (mathematically it ensures convergence to stable states) have hardly been investigated in the context of the biological plausibility of attractor dynamics and function. Nor have the very low firing rates reported in vivo been reproduced. Our approach relies on a biophysically detailed attractor network model of cortex with a spatial scale spanning several hypercolumns [16], which draws on known anatomy and connectivity and allows some of these questions to be addressed.

The network model
The network contains two types of neurons, excitatory pyramidal cells and inhibitory basket cells, each composed of several compartments modeled by Hodgkin-Huxley equations. The basic functional units of the network are, however, minicolumns, each containing 30 recurrently connected pyramidal cells (Figure 1), inspired by the columnar structure of sensory cortex [41,47-49]. These should not necessarily be seen as anatomical columns but rather as functional columns consisting of subgroups of more tightly connected neurons, as found throughout cortex [40,42,43,50-55]. A cluster of minicolumns, spanning a few hundred microns, constitutes a hypercolumn in the network. Since the minicolumns within each cluster are coupled through a pool of basket cells, a hypercolumn can be defined by the extent of non-specific feedback inhibition [52] (Figure 1). In earlier studies [16] we used down-scaled hypercolumns containing 8 minicolumns, but in subsequent work hypercolumns contained at least 49 minicolumns [17-20]. The feedback from the basket cells has several functions. It normalizes activity in the network, provides the means for mutual competition that implements winner-take-all (WTA) dynamics within a hypercolumn, and finally produces oscillations, which in turn add several interesting dynamical features to the network. Similar local WTA dynamics, on the scale of ~200 microns, were recently observed in auditory cortex in vivo [42].
We have typically modeled a cortical patch of about 1.5 x 1.5 mm using a 9-hypercolumn network. Distributed, retrievable and sparse patterns of activity are stored as attractors in this network. This is achieved by long-range interactions between pyramidal cells in minicolumns across different hypercolumns (Figure 1). Such structured, horizontal connections were originally adopted in the model as an assumption, but they later received increased experimental support from studies of layer 2/3 connectivity [46,56,57]. In the work presented here, only orthogonal attractor patterns are stored, i.e. each minicolumn participates in only one global pattern. Although overlapping patterns, where each minicolumn participates in several patterns, increase memory storage capacity, they lead to similar results [58]. Data from in vivo paired recordings are used to bring connectivity and synaptic weights as close to biology as possible [59], but assumptions regarding long-range connectivity have to be made.
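The notion of orthogonal patterns in a modular network can be made concrete with a small sketch. It assumes, purely for illustration, 9 hypercolumns of 49 minicolumns each and the simplest possible assignment, in which pattern p recruits minicolumn p in every hypercolumn; the actual assignment used in the simulations is not specified here.

```python
import numpy as np

N_HYPERCOLUMNS = 9    # size of the typical network described in the text
N_MINICOLUMNS = 49    # minicolumns per hypercolumn (illustrative combination)

def orthogonal_patterns(n_hyper, n_mini):
    """Binary patterns of shape (pattern, hypercolumn, minicolumn).

    Pattern p activates minicolumn p in every hypercolumn (a hypothetical,
    simplest-case assignment), so no minicolumn is shared between patterns.
    """
    patterns = np.zeros((n_mini, n_hyper, n_mini), dtype=int)
    for p in range(n_mini):
        patterns[p, :, p] = 1
    return patterns

patterns = orthogonal_patterns(N_HYPERCOLUMNS, N_MINICOLUMNS)
flat = patterns.reshape(N_MINICOLUMNS, -1)

# Overlap matrix: diagonal = active minicolumns per pattern, off-diagonal = shared minicolumns.
overlap = flat @ flat.T
# Number of patterns each minicolumn participates in.
participation = flat.sum(axis=0)

print("active minicolumns per pattern:", overlap[0, 0])
print("shared minicolumns between patterns 0 and 1:", overlap[0, 1])
print("fraction of minicolumns active in a pattern:", flat[0].mean())
```

With this assignment each pattern activates exactly one minicolumn per hypercolumn, which is what makes the stored activity both distributed (across hypercolumns) and sparse (within each hypercolumn).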

Attractor properties, low firing rates and nested oscillations
We have found that stable attractor activity can indeed be maintained for plausible synaptic weights and very low firing rates if the network is operating in an oscillatory regime (Figure 2). These oscillations, in the range of 25-40 Hz, correspond to upper beta [60] and gamma-like [61] oscillations in vivo, which have been correlated with active stimulus processing and memory recall [60-68]. In our network, the oscillations are generated by the strong feedback inhibition from basket cells (pyramidal interneuron gamma (PING) network; [69,70]). This feedback inhibition also effectively underlies the selection of a winning population in the WTA circuit within a hypercolumn and controls firing rates in this winning cell assembly.
The oscillatory regime is also interesting for other computational reasons. Due to the gamma-cycle dynamics, an attractor cell assembly could maintain its activity and suppress the activity of competing assemblies already at an average firing rate of 3 s⁻¹ per pyramidal cell [17]. This can be explained by the dynamics of the gamma cycle, which has a phase dominated by excitation, where pyramidal cells have an opportunity to fire, followed by a phase where the innervated basket cells shut down the activity in the network. As this inhibition wears off, there is a race between populations of pyramidal cells to reach the firing threshold before recruited basket cells shut down the activity again [65]. As a result, only a small bias (low firing rates) in favor of one of the competing populations is needed to activate or maintain a given attractor.
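The race to threshold within each gamma cycle can be sketched with a deliberately simple rate-free toy model. All of the following are assumptions made for illustration: each population is a single leaky integrator, all populations restart from the same inhibited state at the beginning of a cycle, the first population to cross threshold triggers inhibition that resets everyone, and the parameter values (drives, `tau`, `threshold`) are hypothetical. The sketch is not the Hodgkin-Huxley network of the chapter.

```python
import numpy as np

def gamma_race(drives, n_cycles=50, threshold=1.0, tau=0.010, dt=1e-4,
               noise_sd=0.0, seed=0, max_steps=10_000):
    """Count how often each population wins the race to threshold.

    drives    : constant excitatory drive per population (must exceed threshold)
    noise_sd  : standard deviation of additive noise (0 gives a deterministic race)
    """
    rng = np.random.default_rng(seed)
    drives = np.asarray(drives, dtype=float)
    wins = np.zeros(drives.size, dtype=int)
    for _cycle in range(n_cycles):
        v = np.zeros(drives.size)            # inhibition has just reset every population
        for _step in range(max_steps):
            noise = noise_sd * np.sqrt(dt) * rng.standard_normal(drives.size)
            v += dt * (drives - v) / tau + noise
            if v.max() >= threshold:
                wins[np.argmax(v)] += 1      # winner fires; feedback inhibition ends the cycle
                break
    return wins

# Population 0 receives roughly 3% more drive than its two competitors.
wins = gamma_race([1.55, 1.50, 1.50])
print("cycles won per population:", wins.tolist())
```

In the noise-free case the slightly favored population reaches threshold first in every cycle, so a small bias translates into an all-or-nothing outcome. Setting `noise_sd` above zero turns this into a probabilistic competition, which can be used to explore how much bias is needed for reliable selection.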
Since the network is highly dependent on the activity in the distant recurrently connected hypercolumns, an intrinsic bias is mediated by long-range excitation, which arrives out of phase with respect to local excitatory inputs (Figure 2A), often in the inhibition-dominated part of the local gamma rhythm. This reflects an integration of global evidence for a given memory pattern on the gamma time-scale and implies that the resulting decision to either maintain the current activity pattern in the network or not is made within a short temporal window in each gamma cycle, serving as a discrete time unit. Consequently, transitions from and to active attractor states are globally sharp. The inter-hypercolumnar connections underlying these computations in a modular network help to stabilize the oscillatory regime since the global excitation arrives out of phase with respect to the local firing [17].
Functional Brain Mapping and the Endeavor to Understand the Working Brain
The fast state transitions in the network can also be understood from the perspective of balanced excitation and inhibition [11,71,72]. Since spiking of individual neurons in balanced networks is driven by input fluctuations rather than average net excitation, with the membrane potential close to the firing threshold, rapid state transitions can occur [71,72]. The balance, which also results in highly irregular firing at the single-cell level [72], is roughly preserved in a large parametric region of the oscillatory regime [17,69]. The model can therefore operate in this regime without a need for fine-tuning or plasticity-induced synaptic changes, which are otherwise necessary in memory networks [9,73]. The oscillatory regime thus results in fast transitions and irregular firing [17,69] with a CV2 close to one (during attractor activation), as often reported in vivo during delayed match-to-sample tasks [39,74,75].
Classical attractor networks remain in an attractor state once they fall into one, as long as there is no external input forcing a transition. One of the biological mechanisms that can cause such global transitions out of active attractor states is neural fatigue, implemented in our modeling work by the inclusion of cellular adaptation and synaptic depression in the model [76]. Together they render the attractor lifetime finite, and the level of adaptation has a direct effect on the attractor duration (Figure 3A). The dynamics of activation and deactivation of attractors with finite lifetimes result in an increase in theta/delta-band power of the synthesized local field potentials (LFPs) (Figure 4A). The peak frequency of this rhythm corresponds roughly to the inverse of the attractor's dwell time. In consequence, the co-emerging gamma and theta/delta rhythms are coupled, i.e. the phase of the slower theta/delta wave modulates the amplitude of the faster gamma activity (Figure 4B). Such nested oscillations have been widely reported as a neural correlate of various memory paradigms [62,77-81]. Theta oscillations by themselves have also been connected to encoding, learning and retrieval of memory objects [68,82-87]. In addition, theta phase modulations of firing rates observed in vivo [85] can also be found in the model (Figure 4C).
From the functional perspective, the network is capable of memory completion and pattern rivalry (Figure 5). Memory completion was tested by providing the network with partial stimuli of the stored patterns and examining whether full activation of the stored activity pattern was achieved via the lateral long-range connections. This occurs when roughly one third of the minicolumns in a pattern receive brief stimulation (Figure 5A, B). Pattern rivalry reflects the network's ability to resolve ambiguities in the input. When two patterns are simultaneously stimulated and their relative strengths vary, it turns out that small differences between stimuli can have a decisive impact on which pattern is activated and which one is extinguished [16]. Lundqvist et al. [16] demonstrated that relative differences in input strength of 25% consistently selected the more strongly stimulated assembly. This is by no means the lower limit, though, and here we used 10% differences (Figure 5B). Once the activity of the winning pattern is terminated due to adaptation and the same conflicting stimuli are applied again, the weakly stimulated pattern typically gets activated (Figure 5B).
Figure 3. [...] The network will quickly end up in one of its attractor states at the onset of each simulation. At t = t_lt (broken lines) neural fatigue and synaptic depression have increased the energy of this attractor such that noise will bump the network into another attractor state. If there is no neural fatigue, the attractor states will be persistently active until the network is driven into a new state by external stimulation. B: Bistable network with one default state and several coding attractors. This bistability is achieved by either scaling up the network or increasing mutual inhibition between cell assemblies. At the onset of the simulation we have here stimulated a specific coding attractor. At t = t_lt (broken line) the network will again exit this state but now jump into the ground state. The network will remain in this state until one of the coding attractors is stimulated. C: Bistable network with added synaptic augmentation. Solid lines show the network state just after the stimulated attractor has terminated due to neural fatigue and the network has retreated to its ground state. After some time t, longer than the fast decay of neural fatigue but shorter than the decay of the more long-lasting synaptic augmentation, the energy landscape is altered (broken lines). During this time window the network is likely to jump back into the previously active attractor spontaneously.
Conceptually, we consider minicolumns, rather than single cells, as the basic functional units of the network. This means that information processing does not rely on single cells but on recurrently connected neuronal populations. This perspective recently obtained additional experimental support [42] and has several important implications. Firstly, the connectivity within the network at the unit level can be increased without affecting the biologically realistic connectivity at the single-cell level [22]. Since a pyramidal cell receives roughly 10,000 synapses, full cell-to-cell connectivity is not possible, even within a small cortical volume. With minicolumns acting as computational units, a closer approximation of the full connectivity assumed in theoretical studies of attractor networks can be obtained (another factor that reduces the need for full cell-to-cell connectivity is the dense local inhibition, which implements disynaptic connections between a vast number of pyramidal cells). Secondly, since the average output of each minicolumn, rather than that of a single cell, reflects the activation of a distributed memory pattern, memory retrieval is more robust to cellular variability or synaptic failures. In this light, irregular and rare firing of individual pyramidal cells does not undermine the stability of the retrieval process. On the contrary, this irregular firing is the manifestation of a dynamically regulated network state where the population activity does not depend on the spike timing and firing rates of individual cells. In consequence, the network function is robust to cell death and synaptic loss [88]. Without adjusting the synaptic weights, more than 50% of cells could be removed with no detrimental effects on the attractor retrieval dynamics (Figure 6). The removal of synapses can be performed in two different ways. First, if all connections are removed from one cell at a time, the effect is similar to that of simply removing cells. Second, in the scenario where individual connections are removed at random, the network becomes slightly more sensitive but still tolerates a synaptic loss of roughly 40%. This number can be increased to 60% if the loss is compensated by increasing the conductance of the remaining synapses [88].
Figure 5. [...] receives slightly stronger excitation and will sustain its activity while the cell at the bottom will be suppressed (pattern rivalry). The cell in the middle did not receive any stimulation but belonged to the winning assembly and becomes active (pattern completion). B: Global spiking dynamics demonstrating completion and rivalry. Two assemblies receive brief, partial input to 1/3 of their pyramidal cells at t = 0.5 s (green) and t = 1.5 s (cyan). These inputs quickly spread so that the full patterns are activated (pattern completion). At t = 5 s both patterns receive stimulation simultaneously, but the green pattern receives 10% stronger input. This pattern quickly activates at the cost of the cyan pattern. At t = 6 s the green pattern again receives 10% stronger input, but due to the recent activation it is partly fatigued and the cyan pattern prevails.

Scaling the network and the emergence of bistability and alpha oscillations
Since the scale of the original model was small relative to a cortical area in terms of the number of hypercolumns and minicolumns (while the number of cells within each minicolumn was consistent with biological evidence), it becomes relevant to investigate whether biologically plausible neural dynamics and attractor function can be maintained at much larger simulated scales. For instance, the question of whether a wide distribution of axonal delays can coexist with stable and coherent activations of cell assemblies should be addressed. In addition, it is important to show that the relatively few connections that each pyramidal cell can form are sufficient for stable memory retrieval even at cortical scales. In order to address these questions, we scaled the network considerably, up to the size of mouse cortex, containing 22 million neurons and spanning 16 cm² [20].
Due to the modular structure of the network, and arguably of cortex, it is indeed scalable with largely preserved dynamics. Once hypercolumns are scaled to realistic size, only the density of the connections across them has to be re-scaled in order to maintain the dynamical regime as the network grows. As the number of hypercolumns in the network was increased, we kept the number of long-range (cross-hypercolumnar) connections terminating on each pyramidal cell constant, progressively diluting the probability that two distant neurons connect. Since biological neurons have limited physical space to make connections on their dendrites, an equivalent process seems likely as in vivo systems are scaled up. As a result, a single cell sees roughly the same amount of excitatory and inhibitory input once an attractor state is entered, regardless of network size. Dynamics and function during attractor retrieval were maintained even for the largest simulations without any parameter changes [20]. The transitions to and from attractor states turned out to be surprisingly coherent, even though the longest time delays within each assembly were 50-60 ms. This effect was again, as described above, obtained due to the interdependence of minicolumns in each pattern, mediated by the gamma cycle dynamics, and the network operating in a balanced regime, where only small changes in excitation are needed for state transitions to occur.
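The scaling rule described above, a fixed number of long-range inputs per cell and hence a connection probability that falls as the network grows, can be written down in a few lines. The sketch assumes orthogonal patterns with one partner minicolumn of 30 pyramidal cells in every other hypercolumn; the target in-degree `k_in = 90` is a hypothetical value chosen for illustration, not a parameter taken from the model.

```python
def long_range_connection_probability(n_hypercolumns, cells_per_minicolumn=30, k_in=90):
    """Connection probability that keeps the long-range in-degree fixed at k_in.

    Candidate presynaptic cells are the pyramidal cells of the partner minicolumn
    in each of the other hypercolumns (assumption: orthogonal patterns).
    Returns (probability, number of candidate presynaptic cells).
    """
    n_sources = (n_hypercolumns - 1) * cells_per_minicolumn
    probability = min(1.0, k_in / n_sources)
    return probability, n_sources

for n_hyper in (9, 100, 1000):
    p, n_sources = long_range_connection_probability(n_hyper)
    print(f"{n_hyper:5d} hypercolumns: p = {p:.4f}, expected in-degree = {p * n_sources:.1f}")
```

The expected number of long-range inputs per cell stays constant while the pairwise connection probability is diluted in proportion to network size, which is why a single cell sees roughly the same input during attractor retrieval regardless of scale.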
Despite largely preserved attractor retrieval dynamics, there are functional and dynamical consequences of scaling up the network. Most importantly, another dynamical state of the network emerges [11,20] in addition to the aforementioned active attractor coding state (Figure 2B) once each hypercolumn has more than 25 minicolumns. Since this new state becomes the default condition of the bistable network (Figure 3B) in the absence of any external stimulation, it is referred to as the ground state. In our network it is manifested by global alpha-band (~10-20 Hz) oscillations (Figure 2B) and is characterized by very low levels of activity in all minicolumns without dominance of any pattern. This state is facilitated by the mutual competition between attractor patterns [11] and stabilized by feedback inhibition that grows with the network size. In the smaller network, noise fluctuations quickly activated one pattern at the expense of the others, leading to a sequential recall of the patterns in random order. In the larger network, on the other hand, it is possible to maintain the state of competition between attractor patterns as long as there is no sufficient bias towards one of them, hence the emergence of a new stable state. This bias could be either in the form of external stimulation of a specific pattern or internal mechanisms such as synaptic facilitation, which we used to store a subset of patterns in working memory ([18,19]; see section Multi-item working memory).
In the scaled-up bistable network, successful pattern activation by an external cue is coupled to a transition in the oscillatory dynamics from the alpha to the gamma rhythm (Figure 2B). Similar stimulus-induced transitions have been reported in layer 2/3 of the visual cortex in vivo [66,89]. In the context of extensive experimental work on neural oscillations, our two distinct network states correspond to the general view that alpha reflects idling or pre-stimulus readiness (for a review see [90,91]) and gamma is a correlate of active processing ([61]; for a review see [64,65]).
What are the mechanisms underlying these rhythms and, more importantly, the transition between them in our network? In balanced networks with oscillatory population activity and irregular firing, the oscillatory frequency depends, among other factors, on the level of overall excitation in the network [92]. Comparing spiking populations in the two stable network states, the excitation level and firing rates are higher during active memory retrieval, thereby increasing the oscillatory frequency relative to the ground state. At the limit where the recurrent excitation within cell assemblies is just strong enough to promote stable attractor states, the switch from the ground state to one of the coding attractors is associated with a minimal increase in excitation and oscillatory frequency. Although the total number of spikes elicited from the pyramidal cell population as a whole remains the same, after stimulation all spikes are elicited from the active cell assembly, i.e. the combination of single minicolumns across all the hypercolumns, instead of being spread out between all pyramidal cells as in the ground state [17]. This effect occurs since cells in the active assembly climb slightly faster to the firing threshold in each oscillatory cycle, thereby shutting down competitors before they get a chance to spike and influence the network dynamics. It illustrates how a very small bias can have a strong impact on the spiking in oscillatory, balanced networks. As recurrent excitation is increased, the gap between the oscillatory frequencies in the two states also widens towards a clear distinction between alpha and gamma rhythms, reflecting the gradual stabilization of the active state. To maintain attractor-coding activity, a cell assembly has to oscillate faster than the ground state frequency. Towards the end of the attractor's lifetime the oscillatory frequency drops due to adaptation and the network consequently falls back into the ground state.
As with the balanced regime, the bistable regime with two simultaneously stable states also exists in non-oscillatory networks [11]. However, the advantage of oscillatory networks is that the parametric range of the bistable regime becomes much wider and less sensitive to perturbations in excitation [17]. The strong feedback inhibition needed for a stable ground state does not destabilize the active attractor states. On the contrary, it has relevant functional and dynamical implications for the network during memory retrieval, as discussed in the previous section.
In general, neural oscillations as a population phenomenon occur due to strong feedback inhibition that periodically shuts down activity in a network, and therefore typically destabilizes persistent activity in a cell assembly [93]. However, if this cell subset is biased in any way, in our case by the long-range excitation out of phase with respect to the local oscillations, the persistent activity in the oscillatory regime becomes extremely stable instead. Once the network can tolerate periodic hyperpolarization without terminating the activity permanently, strong feedback inhibition can be used to dynamically balance fast changes in excitation. Then, as long as the inhibition is strong enough to periodically shut down the network, it remains roughly balanced.

Multi-item working memory
Attractor networks have been proposed as a modeling framework for a working memory system, which temporarily maintains a small subset of memory items. Models of spatial working memory have, for instance, used persistent activity in bump-attractor networks to preserve a trace of a specific direction [8,93]. We can obtain a similar effect in our network when the adaptation mechanisms are subdued. A stimulated attractor then remains persistently active as a cued memory over several seconds [17]. The persistent activity approach is limited, however, since only one item or direction can be stored at any given time due to the mutual inhibition between attractors, whereas working memory is reported to contain up to seven items simultaneously [94,95]. In this section we discuss an alternative approach to working memory maintenance known as periodic replay [10,14,18,19,84,96], which allows multiple items to be stored. Although in both working memory models only one attractor can be active at any given time, in the periodic replay paradigm each attractor has a brief lifetime instead of being persistently active. The encoded items are retrieved in sequence, one after another, and are periodically reactivated.
In computational networks this effect can be achieved by incorporating either cellular [96] or synaptic [10,14,18,19] mechanisms that dynamically adjust the excitability of activated neurons. In the latter case, it can be achieved by adding synaptic augmentation, observed in prefrontal neuronal subgroups [97], on top of faster synaptic depression in a bistable attractor network. At the single-synapse level, this makes the conductance vary dynamically over time. During a brief pre-synaptic spike train the amplitude of excitatory post-synaptic potentials (EPSPs) remains static or slightly decreases over time due to the combined effect of synaptic augmentation and synaptic depression. However, due to the slower decay of the augmentation, a new spike arriving roughly one second after the initial burst elicits a significantly magnified EPSP. At the cell assembly level, this implies that an attractor that has been activated by stimulation is temporarily more excitable than the ground state some time after its termination (Figure 3C). During this window it has a high chance of spontaneously reactivating and, in the process, refreshing the synaptic augmentation. In this way, an initially stimulated pattern becomes periodically reactivated. During silent periods there is an opportunity for other assemblies to be replayed (Figure 7A). Because the augmentation decays, each memory pattern selected for replay needs to be reactivated within the decay time window following its last deactivation in order to maintain its elevated excitability. As a consequence, only a limited number of items can be stored. In particular, up to ~6 attractor memories can be simultaneously augmented and hence periodically reactivated [18,19] for biologically realistic levels of synaptic augmentation.
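The interplay between fast depression and slow augmentation at a single synapse can be illustrated with a minimal phenomenological sketch. This is not the synapse model used in the network: the update rule (a utilization variable `u` that is incremented by each spike and decays slowly, and a resource variable `x` that is depleted by each spike and recovers quickly) and all parameter values (`U`, `k`, `tau_aug`, `tau_rec`) are assumptions chosen only for illustration.

```python
import math

def epsp_amplitudes(spike_times, U=0.2, k=0.1, tau_aug=5.0, tau_rec=0.3):
    """Relative EPSP amplitude (u * x) for each pre-synaptic spike.

    All parameters are hypothetical illustration values:
    U        baseline utilization of synaptic resources
    k        augmentation increment per spike
    tau_aug  slow decay of augmentation back to U (seconds)
    tau_rec  fast recovery from depression (seconds)
    """
    u, x = U, 1.0
    last_t = None
    amps = []
    for t in spike_times:
        if last_t is not None:
            dt = t - last_t
            # Between spikes: augmentation decays slowly, resources recover fast.
            u = U + (u - U) * math.exp(-dt / tau_aug)
            x = 1.0 - (1.0 - x) * math.exp(-dt / tau_rec)
        release = u * x           # amplitude is proportional to released resources
        amps.append(release)
        x -= release              # depression: used resources become unavailable
        u += k * (1.0 - u)        # augmentation: utilization is stepped up
        last_t = t
    return amps

# A brief 50 Hz burst of five spikes, then one probe spike 1 s after the burst.
burst = [i * 0.02 for i in range(5)]
probe_time = burst[-1] + 1.0
amps = epsp_amplitudes(burst + [probe_time])
burst_amps, probe_amp = amps[:-1], amps[-1]

print("burst amplitudes:", [round(a, 3) for a in burst_amps])
print("probe amplitude: ", round(probe_amp, 3))
```

With these assumed values, depression offsets augmentation during the burst, so the amplitudes do not grow substantially. One second later the resources have recovered while most of the augmentation remains, so the probe spike produces a larger response than any spike in the burst.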
The notion that individual memory objects are replayed at a theta time-scale during working memory maintenance has support from human MEG recordings [84]. The model can also explain the widely reported finding that alpha-band power decreases [98,99] while gamma- and theta-band power increase [67,98-100] with working memory load. We obtain this effect (Figure 7C) because, for each additional memory item encoded in working memory, the network spends on average less time in the alpha-dominated ground state and more time in its active retrieval state, which is correlated with nested theta-gamma oscillations [18]. The effect saturates at the full memory capacity of the network.
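The load dependence described above follows from simple time-sharing between two states, which the toy bookkeeping below makes explicit. It is not derived from the network simulations: the per-state band powers, the dwell time per replay (`dwell`), the replay period (`cycle`) and the capacity of six items are all hypothetical values chosen for illustration.

```python
def relative_band_power(max_load=7, capacity=6, dwell=0.15, cycle=1.0):
    """Band power per load, normalized to Load 1 (toy time-sharing model).

    Assumptions (hypothetical): each maintained item is replayed once per
    `cycle` seconds and keeps the network in the active state for `dwell`
    seconds; items beyond `capacity` are not maintained.
    """
    # Assumed power of each band in the two network states (arbitrary units).
    ground = {"theta": 0.2, "alpha": 1.0, "gamma": 0.2}
    active = {"theta": 1.0, "alpha": 0.2, "gamma": 1.0}

    def power(load):
        items = min(load, capacity)
        f_active = min(1.0, items * dwell / cycle)  # fraction of time in retrieval
        return {b: f_active * active[b] + (1.0 - f_active) * ground[b]
                for b in ground}

    base = power(1)
    return {load: {b: p / base[b] for b, p in power(load).items()}
            for load in range(1, max_load + 1)}

table = relative_band_power()
for load, bands in table.items():
    print(load, {b: round(v, 2) for b, v in bands.items()})
```

In this sketch alpha power falls and theta and gamma power rise with each added item, and the values stop changing once the assumed capacity is reached.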
The notion of theta-coupled replay of memory items with accompanying theta-gamma phase-amplitude coupling is also consistent with single-cell spike statistics obtained from recordings in prefrontal areas and superficial layers of cortex, where a relative abundance of cells displaying clumpy-bursty behavior with Lv [101] and CV2 [102] well above 1 was observed [103,104]. This clumpy-bursty behavior can be reproduced when single cells burst in specific theta periods and are silent in the others, as is seen in the periodic replay paradigm [19].
Although the estimated variability during the active theta periods results in Lv close to 1, the inclusion of long inter-spike intervals (ISIs) introduced by the silent theta periods boosts Lv to 1.5 (Figure 8), as reported for clumpy-bursty cells in vivo.This effect occurs for firing rates within a certain range, overlapping with the ones observed in our network model [19].
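The effect of silent periods on Lv can be checked with a short sketch. It uses the standard definition of local variation, Lv = 3/(n-1) Σ ((I_i - I_{i+1}) / (I_i + I_{i+1}))^2 over consecutive ISIs, which we assume is the measure referred to in [101]. The spike trains are synthetic: exponentially distributed ISIs stand in for firing within active theta periods, and the mean ISI, the gap length and the number of ISIs per burst are hypothetical values, not taken from the network model.

```python
import random

def local_variation(isis):
    """Local variation Lv of a sequence of inter-spike intervals."""
    n = len(isis)
    if n < 2:
        raise ValueError("need at least two inter-spike intervals")
    total = sum(((a - b) / (a + b)) ** 2 for a, b in zip(isis, isis[1:]))
    return 3.0 * total / (n - 1)

rng = random.Random(0)      # fixed seed for reproducibility
mean_isi = 0.02             # assumed mean ISI within an active period (s)
silent_gap = 0.5            # assumed silent interval between bursts (s)
isis_per_burst = 4          # assumed number of ISIs in each active period

# Irregular (Poisson-like) firing without silent periods.
active_isis = [rng.expovariate(1.0 / mean_isi) for _ in range(2000)]

# Same ISIs, but with a long silent interval inserted after every burst.
replay_isis = []
for i, isi in enumerate(active_isis):
    replay_isis.append(isi)
    if (i + 1) % isis_per_burst == 0:
        replay_isis.append(silent_gap)

lv_active = local_variation(active_isis)
lv_replay = local_variation(replay_isis)
print("Lv without silent periods:", round(lv_active, 2))
print("Lv with silent periods:   ", round(lv_replay, 2))
```

Each long interval sits next to two short ones, and those pairs contribute terms close to 1 to the sum instead of the average of about 1/3 expected for Poisson-like firing. This is why interleaving silent periods raises Lv above 1.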
Finally, we would also like to present unpublished results from a study aimed at reproducing the phenomenon of recency and primacy effects [105] in list-learning paradigms. When a subject is asked to remember a list of items exceeding the capacity of working memory, there is a marked tendency for objects from the beginning (primacy) and the end (recency) of the list to be recalled with a greater likelihood. To simulate this, the network is presented with 10 memory items at the rate of 1 s⁻¹, followed by a 10 s period corresponding to a free recall phase. On average over 100 trials, 5.0±0.7 (mean ± standard deviation) items are maintained such that they are replayed in the recall phase. In addition, memory items at the beginning and at the end of the list are more frequently encoded than those presented in the middle (Figure 7B). The simulated recency effect can be explained by the fact that augmentation in the assemblies activated towards the end of the presentation period is relatively high when the free recall period starts. The primacy effect, on the other hand, can be explained by the fact that the network has time between presentations to replay the early items already in the presentation phase, and thus reinforce their increased excitability. If the network is largely denied this opportunity by presenting the list of items in quicker succession (at the rate of 2 s⁻¹), around five items are again maintained in working memory (4.9±0.8), but the first items now have the smallest chance of being remembered (Figure 7B). At their cost, the last items instead have an even higher chance of being replayed during the free recall period.

Attentional blink
Attractor networks also allow us to study attentional mechanisms and their functional consequences. Attentional effects can be incorporated into such models in several different ways. For instance, it has been studied how top-down activity can bias certain attractors at the cost of others and thus serve as a model for top-down attention [13,106]. In our work, we instead focus on the potential neural manifestations of attention and examine how they correlate with the network's capability to retrieve weakly stimulated memory patterns. In that vein, we are currently studying the effects of both phase and power modulations of ongoing alpha oscillations on the network's performance. Here, however, we want to discuss results related to the attentional blink phenomenon [107-110]. It refers to the inability of humans to detect and process two relevant stimuli presented in quick succession; the first item masks the perception of the second one even if they are presented for equally long. This masking effect is not maximal when the visual targets are shown immediately one after another, but instead when the relative delay is around 300 ms [108,109]. The attentional blink phenomenon has been correlated with the P300 component [108] and evoked gamma oscillations in electroencephalography signals [109].
In the network, we obtain a qualitatively similar time-dependent attentional blink effect [16] related to evoked gamma oscillations. This is because an activated cell assembly attains a peak in firing rates after some delay relative to the stimulation. The delay corresponds to the time needed for the recurrent network to build up activity before adaptation reduces the rates again. At this peak of activity, other competing assemblies are maximally suppressed and thus harder to activate. We associate this effect with the impaired ability to detect subsequent stimuli, consistent with psychophysical data ([108,109]; Figure 9). This phenomenon was recently studied in more detail using the same network model [58].
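The threshold-versus-delay data in Figure 9 are summarized by fitted third-degree polynomials. The sketch below shows one way such a fit can be computed with ordinary least squares and how the delay of maximal suppression can be read off the fitted curve. The data points are synthetic: they are generated from a known cubic that was chosen only so that it peaks at 300 ms, and they are not values from the simulations. The function name `fit_polynomial` and the fitting procedure are assumptions, since the text does not specify how the fit was done.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree] of c0 + c1*x + c2*x**2 + ...
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for j in range(col, n):
                A[row][j] -= factor * A[col][j]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs

def evaluate(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

# Synthetic placeholder data: delay in units of 100 ms, threshold in
# "minicolumns to stimulate", generated from a cubic with its maximum at 3.
true_coeffs = [5.0, 8.1, -1.8, 0.1]
delays = [0.5 * i for i in range(13)]            # 0 to 600 ms
thresholds = [evaluate(true_coeffs, d) for d in delays]

coeffs = fit_polynomial(delays, thresholds, degree=3)
grid = [i / 100 for i in range(601)]
peak_delay = max(grid, key=lambda d: evaluate(coeffs, d))
print("fitted coefficients:", [round(c, 3) for c in coeffs])
print("delay of maximal threshold (x100 ms):", peak_delay)
```

For noisy data the same function returns the least-squares cubic. Solving the normal equations directly is adequate for a low-degree fit over a small range of delays, although a QR-based solver would be the more robust choice in general.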

Summary and conclusions
We have reviewed evidence that the neural activity of superficial cortical layers is to a large extent compatible with the non-linear dynamics displayed in recurrent attractor networks.
Research in the field of computational neuroscience has touched upon various aspects of the attractor theory with emphasis on its biological relevance and functional implications. In our modeling work, where we have used a biophysically detailed attractor network inspired by cortical connectivity, we have demonstrated how novel features such as modular structure and oscillatory dynamics render the model more robust and consistent with biological findings. In addition, we have shown how our mesoscopic network model can be used to link the lower-level neural substrate with higher-order cognitive or behavioral phenomena. In particular, we have conceptually replicated recency, primacy and attentional blink effects. In the light of the network's dynamics we have also motivated the limited capacity of working memory. The model can be regarded as a crude model of the superficial layers of associative cortex, taking the form of a large distributed network of attractor networks. Future work is intended to differentiate individual cortical areas with respect to connectivity and function. We envisage that this work will be accelerated by the concerted effort in computational neuroscience to study cortical function from a bottom-up perspective.

Figure 1. Network setup and connectivity. A: Detailed connectivity of a single hypercolumn, containing 49 minicolumns. B: A sketch of the long-range connectivity within a cortical patch, consisting of several hypercolumns (9 in a full patch). The numbers on the arrows give the connectivity and post-synaptic potential (PSP) size at resting potential of the post-synaptic cell.

Figure 2. Oscillatory activity in the various network states. A: During attractor retrieval each minicolumn in the active assembly oscillates at gamma frequency (25-40 Hz). All pyramidal spiking within a minicolumn is concentrated at the peak of each oscillation (circles) while the incoming spikes from distant minicolumns are evenly distributed across the whole oscillatory cycle, stabilizing activity within the assembly. B: Bistable network receiving stimulation of one of its coding attractors at t = 2 s. This time point marks a transition from alpha-like (ground state) to gamma-like (attractor state) oscillations (top) and a simultaneous transition from diffuse low-rate firing to concentrated higher-rate spiking (bottom) in a specific cell assembly. Spiking from pyramidal cells in this assembly is shown as green dots while all other spikes are depicted as black dots.

Figure 3. Cartoon of energy landscapes in various regimes. Solid lines depict the energy of various states at time t = 0, and broken lines at a later time point (specified in A-C). The ball indicates which state the network is likely to be in, i.e. one of the states with the lowest energy. A: Network with attractors of limited lifetime, t_lt (200-800 ms depending on parameters). The network will quickly end up in one of its attractor states at the onset of each simulation. At t = t_lt (broken lines) neural fatigue and synaptic depression have increased the energy of this attractor such that noise will bump the network to another attractor state. If there is no neural fatigue, the attractor states will be persistently active until the network is driven into a new state by external stimulation. B: Bistable network with one default state and several coding attractors. This bistability is achieved by either scaling up the network or increasing mutual inhibition between cell assemblies. At the onset of the simulation we have here stimulated a specific coding attractor. At t = t_lt (broken line) the network will again exit this state but now jump into the ground state. The network will remain in this state until one of the coding attractors is stimulated. C: Bistable network with added synaptic augmentation. Solid lines show the network state just after the stimulated attractor has terminated due to neural fatigue, and the network has retreated to its ground state. After some time t, larger than the fast decay of neural fatigue but smaller than the decay of the more long-lasting synaptic augmentation, the energy landscape is altered (broken lines). During this time window the network is likely to jump back into the previously active attractor spontaneously.

Figure 4. Gamma and theta phase locking. Gamma and theta power (A) and gamma-filtered (black) and theta-filtered (red) components during active sequential retrieval of attractors (B). C: Histogram of spike events in relation to the theta phase.

Figure 5. Pattern rivalry and completion. A: Single cell dynamics during pattern rivalry and completion. Two cells (top and bottom respectively), part of two distinct assemblies, receive input. The cell at the top is part of an assembly that

Figure 6. Tolerance to cellular death. The stability of attractor dynamics is measured as the dwell time (y-axis) of cell assemblies receiving brief stimulation. Cells are removed at random from the network (x-axis) without adjusting connectivity or synaptic weights.

Figure 7. Multi-item working memory through synaptic augmentation. Synaptic augmentation causes attractors to spontaneously reactivate some time after they have been terminated. This augmentation is then refreshed upon each reactivation, and the attractor is held in working memory in a cyclic fashion. A: Here four items are stimulated (at 0.2, 1.2, 2.2, and 3.2 s), and the time between active recalls is filled by ground state activity. The items presented early start their reactivations already during the presentation period (0-3.5 s). This can explain the bias for items presented early in the list to be remembered, as seen in B. B: Here 10 items are presented, followed by a recall phase where we test which items are replayed. Early (1, 2) and late (9, 10) items have a higher probability of being remembered than intermediate (4-6) items (blue bars). If the list is presented at the rate of 2 s⁻¹ (red bars), the tendency for early items to be remembered is removed. C: Frequency modulations by memory load. Bars show integrated power in the three different power bands (2-6, 10-18, and 28-40 Hz) and five different load conditions. Bars are normalized relative to the power in the Load 1 condition (one memory item), such that power in Load 1 is 1.

Figure 8. Scatter plot of Lv for 100 cells drawn from the persistently active network (A) and the replay network (B).The dotted lines mark the range of Lv values within one standard deviation from the mean.

Figure 9. Attentional blink. In the period shortly after activation of one stored pattern in the network, triggering another pattern requires more stimulation than otherwise. Here, one pattern was first activated at t = 0. After some delay, a second stimulus was applied to a different pattern, attempting to trigger its activation. Two data series (rings and crosses) are shown, corresponding to separate experiments (random seeds). Each data point shows the minimum number of minicolumns in a pattern that have to be stimulated in order to activate the second pattern after a given delay. Third-degree polynomials have been fitted to the data points.