TG and MK conceived and designed the experiments. TG performed the experiments and analyzed the data. TG and MK wrote the paper.
The authors have declared that no competing interests exist.
It is well established that various cortical regions can implement a wide array of neural processes, yet the mechanisms which integrate these processes into behavior-producing, brain-scale activity remain elusive. We propose that an important role in this respect might be played by executive structures controlling the traffic of information between the cortical regions involved. To illustrate this hypothesis, we present a neural network model comprising a set of interconnected structures harboring stimulus-related activity (visual representation, working memory, and planning), and a group of executive units with task-related activity patterns that manage the information flowing between them. The resulting dynamics allows the network to perform the dual task of either retaining an image during a delay (delayed-matching to sample task), or recalling from this image another one that has been associated with it during training (delayed-pair association task). The model reproduces behavioral and electrophysiological data gathered on the inferior temporal and prefrontal cortices of primates performing these same tasks. It also makes predictions on how neural activity coding for the recall of the image associated with the sample emerges and becomes prospective during the training phase. The network dynamics proves to be very stable against perturbations, and it exhibits signs of scale-invariant organization and cooperativity. The present network represents a possible neural implementation for active, top-down, prospective memory retrieval in primates. The model suggests that brain activity leading to performance of cognitive tasks might be organized in modular fashion, simple neural functions becoming integrated into more complex behavior by executive structures harbored in prefrontal cortex and/or basal ganglia.
Before we do anything, our brain must construct neural representations of the operations required. Imaging and recording techniques are indeed providing ever more detailed insight into how different regions of the brain contribute to behavior. However, it has remained elusive exactly how these various regions then come to cooperate with each other, thus organizing the brain-scale activity patterns needed for even the simplest planned tasks. In the present work, the authors propose a neural network model built around the hypothesis of a modular organization of brain activity, where relatively autonomous basic neural functions useful at a given moment are recruited and integrated into actual behavior. At the heart of the model are regulating structures that restrain information from flowing freely between the different cortical areas involved, releasing it instead in a controlled fashion able to produce the appropriate response. The dynamics of the network, simulated on a computer, enables it to pass simple cognitive tests while reproducing data gathered on primates carrying out these same tasks. This suggests that the model might constitute an appropriate framework for studying the neural basis of more general behavior.
An important unanswered question in neurobiology is how neural activity organizes itself to produce coherent behavior. Lesion, electrophysiological, and imaging studies targeting specific cognitive functions have provided very detailed insights into how different regions of the brain contribute to behavior. More specifically, they have shown the role of various regions of cortex in implementing functions such as visual representation of stimuli [
Here, we propose that adequate behavior can be generated from the set of functions mentioned above if the information these cortical regions contain and exchange with each other is managed by executive or control structures in a manner suiting the task at hand. Brain-scale activity coding for integrated behavior might then be constructed by these executive units, from a repertoire of simple neurocognitive functions, which would be selected, recruited, ordered, and synchronized to implement the necessary neural computations.
To illustrate this hypothesis, we present a neural network model able to pass the mixed-delayed response (MDR) task, which was introduced to study memory retrieval in the monkey using visual associations [
The MDR task consists of randomly mixed DMS and DPA trials [
Length of trial periods used: cue, 0.5 s; delay (divided into subdelays d1, 0.3 s; d2, 0.4 s; and d3, 1.0 s); choice, 0.5 s; and response, 0.5 s.
Excitatory and inhibitory neurons (represented by green and red dots, respectively) are arranged in two-dimensional layers P and WM and interconnected by short-distance connections (not shown). Layer VR is composed of four excitatory units (green squares), each representing a group of cortical cells coding for a single image, and four inhibitory units (red squares), which implement lateral inhibition on excitatory VR units. Layers P, WM, and VR are connected via diffuse and homogeneously distributed vertical excitatory projections (green arrows). All connections in the network are fitted with a standard Hebbian learning algorithm, while downward connections are additionally subject to reinforcement learning. Each P neuron receives a single priming connection from either the sustain (represented by a violet arrow) or recall (orange arrow) unit. Layers WM and P are the targets of reset units (reset WM and reset P) which, when active, reinitialize to zero the membrane potential and output of all neurons in the layer. Units Gu, Gd, Iu, and Id gate the activity that travels from VR to WM, WM to VR, WM to P, and P to WM, respectively (dark blue lines). This gives the network the freedom either to transfer information from one layer to another, or to isolate layers so that they can work separately. Visual information from the exterior world enters the network via the Inputs variables, which feed stimulus-specific activity into layer VR (turquoise arrows). Letters between parentheses indicate tentative assignation of network components to cortical or subcortical areas (see
46, area 46; BG, basal ganglia; DL, dorsolateral; IT, inferotemporal; OF, orbitofrontal; OM, orbitomedial; V1, primary visual cortex. Network areas and layers: P, planning; T, task; VR, visual representation; WM, working memory.
The complex overall activity of the network is directed by the coordinated binary firing (either “on” or “off”) of control units, each implementing a basic function: gatings regulate the upward and downward flow of information, resets bring processors back to a null state of activity, and the task units prime activity in layer P. The units' firing patterns are grouped into three sets, each corresponding to a different task: one for fixation trials (A), one for DMS trials (B), and one for DPA trials (C). These firing patterns specify the neural computations performed by the network to pass each task. Note that task parameters and notations are as in
(A) The fixation task only requires that the network observes the sample image presented at the beginning of each trial. To do this, the network first clears from its WM and P layers any activity left over from the preceding trial, and then allows visual information from the presented sample to rise into these layers.
(B) The DMS task generalizes the fixation task, requiring that the network retains the observed sample image during a delay to then match it against target and distractor images during the choice period. These operations are implemented by the above additional activities in the firing patterns of gatings Id, Gu, and Gd (see “Analysis of MDR Task Performance” for details).
(C) The DPA task is identical to DMS except that the network needs to retrieve during subdelays d2 and d3 the image associated with the sample. This recall process is implemented during these periods by additional activities for gatings Iu and Id, and the reset WM unit (see “Analysis of MDR Task Performance” for details).
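The period-by-period logic of these three firing-pattern sets can be summarized in a short sketch. The schedule below is a loose illustration of the description above, not the paper's actual patterns: the unit names follow the figure, the period durations are those of the MDR protocol, but the exact period-by-period assignments are our own simplification.

```python
# Illustrative control-unit schedules for the three trial types
# (our simplification of the firing patterns described in the text).
PERIODS = ["cue", "d1", "d2", "d3", "choice", "response"]
DURATIONS = {"cue": 0.5, "d1": 0.3, "d2": 0.4, "d3": 1.0,
             "choice": 0.5, "response": 0.5}  # seconds, from the protocol

def control_schedule(task):
    """Return {unit: set of periods in which the unit fires}."""
    # Fixation: clear WM and P, then let the sample rise into them.
    sched = {
        "reset_WM": {"cue"},   # clear leftovers from the previous trial
        "reset_P":  {"cue"},
        "Gu":       {"cue"},   # VR -> WM: the sample rises
        "Iu":       {"cue"},   # WM -> P
        "Gd":       set(),
        "Id":       set(),
        "sustain":  set(),
        "recall":   set(),
    }
    if task in ("DMS", "DPA"):
        # Retain the sample, then match it during the choice period.
        sched["sustain"] |= {"cue", "d1", "d2"}
        sched["Id"] |= {"d1"}        # P -> WM stabilizes the memory
        sched["Gu"] |= {"choice"}    # perceive target and distractor
        sched["Gd"] |= {"choice"}    # WM biases VR (target selection)
    if task == "DMS":
        sched["sustain"] |= {"d3"}   # sample remains the target
    if task == "DPA":
        # Switch to recall: clear WM, then let P write the associate.
        sched["recall"] |= {"d3"}
        sched["reset_WM"] |= {"d3"}
        sched["Id"] |= {"d3"}
    return sched
```

Note that in this sketch, as in the model, the sustain and recall units are never active during the same period.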
The model (see
The present model expands on this circuitry, placing it partly under the control of hierarchically higher systems: a task layer (T), a planning memory layer (P) together with its own reset module (reset P), and two new gatings, Iu and Id (
The new task layer T consists of two units with mutually exclusive activities, sustain and recall. It forms the neural implementation of what we propose is the absolute minimal set of processing actions necessary to pass the DMS and DPA tasks. The sustain unit is active whenever the network maintains a memory of the sample image. It is therefore active throughout DMS trials since in this case the sample image is also the target, and during the first half of DPA trials. By contrast, the recall unit is active only when the network recalls the image associated with the sample and maintains its representation in working memory. It is therefore always silent except during the second half of DPA trials, where the network performs target retrieval (
The planning layer P is a higher level working memory with circuitry identical to that of WM, including its own reset unit. P neurons receive excitatory connections from layer WM (
The flow of information between layers WM and P is regulated by gating areas Iu and Id (
The WM–VR circuitry enables the network to perform the operations of sample recognition, sample storage, and the selection as target of the image currently held in working memory. These circuits are now complemented by the novel layers P and T, which allow autonomous modification by the network itself of the content of working memory. The necessary neural computations are guided by the ensemble of gating and reset units and their firing patterns (
Initially, with all connections of equal strength, the model performs the tasks at no better than chance level, even though the control units already possess their mature firing patterns (
The fixation task (
The network is then subjected to DMS training. There, the neural assemblies mentioned above, which act as building blocks for the network, are connected (together and with units of layer VR) to form macrocircuits spanning the whole system: the clusters present in layer WM now form neural representations of the images used for the task, while each circuit in layer P codes for the plan to use a different sample image as target. This latter phase occurs as a result of the reinforcement-learning algorithm that modulates the vertical connections and makes use of the positive and negative reward signals dispensed to the network at the end of each trial to reinforce the productive connections. Connections whose operation tends to increase positive reward become stronger and therefore end up dominating the dynamics of the model. Successful DMS training takes an average of about 41 trials (SD = 19), producing a network that passes DMS with a success rate of 90% or more.
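The reward-gated modulation of the vertical connections can be illustrated with a toy update rule. The form of the rule, the learning rate, and all names below are our own sketch; the text specifies only that connections whose operation tends to increase positive reward are strengthened until they dominate the dynamics, while unproductive ones are weakened.

```python
import numpy as np

def reinforce(W, pre, post, reward, lr=0.1):
    """Reward-gated Hebbian update on vertical connections W (post x pre).

    pre, post : binary activity vectors recorded during the trial.
    reward    : +1 after a correct choice, -1 after an error.
    Co-active pairs are strengthened on positive reward; on negative
    reward the same connections are depressed toward zero.
    """
    coactive = np.outer(post, pre)     # Hebbian eligibility of each pair
    if reward > 0:
        W += lr * coactive             # reinforce productive connections
    else:
        W -= lr * coactive * W         # shrink the connections that
                                       # drove the wrong choice
    np.clip(W, 0.0, 1.0, out=W)        # keep weights bounded
    return W
```

Repeated positive reward makes the co-active weights dominate, mirroring how the productive vertical connections end up dominating the network's dynamics.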
DPA task learning follows last (
We found that in approximately 50% of runs, the network successfully learned all tasks. In 4%–6% of runs, the network did not learn any task at all, while in the remaining runs it managed to pass DMS only.
For further details on the network's dynamics, the reader is referred to short movies of the network passing DMS and DPA trials available on the website of the journal (see
Each trial starts with the presentation through the Input layer of a sample image, which is stored in the network as neural activity spanning the VR and WM layers (i.e., a representation of this image; see
For easier comparison, the two trials share the same sample (image 2) and distractor (image 3) but differ by their targets (circled in green, DMS trial: 2, DPA trial: 1).
The layers T, P, WM, and VR of the network are represented in three dimensions, one on top of the other. Green (red) squares in layers P, WM, and VR represent firing excitatory (inhibitory) neurons. The green arrows represent a sample of the vertical connections strengthened during the learning phase, when the gating controlling them is open. Task units are represented as follows: empty dots represent inactive units; violet and orange dots represent active sustain and recall units, respectively. Priming arrows (violet and orange arrows) are only represented when the corresponding task unit is active. Task parameters and notations are as in
d3-a, first 0.8 s of subdelay d3; d3-b, last 0.2 s of subdelay d3; choice-a, first 0.2 s of choice period; choice-b, last 0.3 s of choice period.
This plan is set into motion during subdelay d1, as image 2 is hidden (
At the beginning of subdelay d2, a signal is presented to the subject to indicate whether the current trial is of type DMS or DPA. In the case of live subjects, this signal is conveyed by a change in the monitor's screen color [
In the case of a DMS trial (
At the end of subdelay d3, the influence of the working memory layer over the visual representation layer is momentarily blocked (
Alternatively, if the task signal indicates a DPA trial, the target will not be image 2, but rather its paired associate, image 1. Since this image has to be retrieved from memory, the network first switches activity in the task layer from the sustain to the recall unit (
Before this is done, however, the network first removes, at the beginning of subdelay d3, the representation of image 2 from working memory. This is necessary to make way for the weaker top-down signal inducing activity specific to image 1. Then, by directing once again the information flow from layer P to layer WM, the planning activity generates in working memory the representation of target image 1 (
The last phase of the trial (i.e., the target perception and selection) proceeds exactly as for the DMS trial above, except that the network now chooses image 1 as target instead of image 2 (
The activity during DMS and DPA trials of excitatory layer VR cells is summarized in
Comparison between simulated (A) and experimental (B) units is restricted to the cue and delay periods (left of the dashed line on theoretical data). Activities to the right of the dashed line (choice and response periods) are predictions of the model. Notation: x → y + z = trial with sample x, target y, and distractor z. Other notations are as in
(A) Activity of excitatory unit 2 of visual representation layer VR during DMS and DPA trials where images 1 and 2 are used as sample and target (see
(B) Recordings of a single IT neuron during four pair-association with color switch [
VR cell firing comprises two components. The first component is produced by the one-to-one connections that rise from the Input layer representing lower visual areas (
VR activity during DPA trials is more complex, reflecting the changes in neural activity taking place in layer WM during the recall of the target image. As a result, if image 2 is used as sample (
Conversely, in DPA trials where image 2 is the target and has to be recalled from its association with sample image 1, cell VR(2) is first silent (periods cue and subdelays d1 and d2) but then starts firing at the beginning of subdelay d3 when the representation of image 2 is generated through recall in layer WM (
WM cells exhibit firings that are both image specific and strongly dependent on the trial type and trial period. These characteristics are illustrated at the population level by
Each line represents the number of active WM neurons at any instant of a particular trial. The pink and blue curves correspond to DMS trials, where images 4 and 3 are used as sample, respectively. These curves illustrate that different neural assemblies represent the sample image when it is perceived by the network, or when this representation is subsequently sustained. For instance, when image 3 is used as sample (blue curve), fewer cells are mobilized by the presentation of the image (cue period) than by its memory sustained during the delay.
The black curve corresponds to the DPA trial where the sample and target are images 4 and 3, respectively. The pink area indicates the number of WM cells mobilized by the evoked and sustained representations of image 4. The blue area denotes the number of cells making up the representation of image 3 recalled by association. This latter representation, in the case of image 3, clearly mobilizes fewer neurons than either the evoked or sustained representations of that same image.
The strong stimulus specificity of WM cells is demonstrated by the different numbers of neurons that fire during DMS trials featuring different sample images. For instance, in
Each of the boxes contains three raster plots (top, center, and bottom rows), which represent the response of a single cell to a DMS trial where the cell's preferred image is the sample (top of each box), a DPA trial where the cell's preferred image is the sample (center row in each box), and a DPA trial where the cell's preferred image is the target (bottom).
Cells: For each class are listed the total number of cells in the class (black), the number of excitatory/inhibitory cells in the class (green/red), and the percentage of the total number of WM neurons this class represents.
Specificity: details the number of cells responding to each of the four images. Note that each type FWM cell responds to more than one image.
The cells contained in these eight classes represent a sample of 725 cells out of the 900 contained in layer WM. We found that a comprehensive classification of the cells' activities requires at least 27 classes, and that the distribution of cells across these classes follows a power law (see
Neurons were first grouped into classes according to their firing patterns during DMS and DPA trials (see
Type AWM cells are the most common, firing whenever the cell's preferred image is perceived, kept in memory, or recalled. Such cells therefore participate in all three image representations mentioned earlier. Cells in the other classes exhibit similar firing patterns, though they lack some of its components. For instance, type DWM units respond to the presentation of images, but they do not fire when the network recalls an image or sustains its representation. They therefore only participate in the evoked representation of images. By contrast, type HWM cells only fire when images are retrieved during DPA trials. They consequently only contribute to the representation of recalled images. Type BWM cells, which do not fire during the cue period, participate in the sustained and recalled representations of images, but not the evoked one. A similar analysis can be extended to the remaining classes of
The diversity in firing patterns present in layer WM is a direct consequence of the structure of the network. Indeed, as shown on
The activity of mature P cells holds more information than that of neurons in layers VR and WM: it depends not only on the images submitted to the network during the current trial, but also on the processing applied to these images (i.e., sustaining the memory of the sample image or recalling the image associated with it).
This task-image duality in the cells' firing is produced by the combined effects of the priming connections projected onto layer P by the sustain and recall units of the task layer, and the image-specific signal sent from layer WM. Since two processing actions are possible (sustain and recall), and a total of four images are used for the task, P cells partition themselves into eight different groups according to their function in the network's dynamics (e.g., sustaining sample image 1, recalling target image 1, etc.). The first four groups each contain cells primed by the sustain unit, stabilizing the representation of one among the four images used in the MDR task.
Each line represents the number of layer P neurons that fire at every instant during DMS and DPA trials. The pink and blue curves correspond to DMS trials where images 4 and 3 are used as sample, respectively. All P cells active during these trials are primed by the sustain task unit, and code for the project of sustaining the representations of sample images 4 and 3, respectively, that are harbored in layer WM.
The black curve represents the number of cells firing during a DPA trial where image 4 is the sample and image 3 is the target. The light pink area denotes cells firing to sustain the representation of sample image 4. This set of cells has a large overlap (fluctuating between 75%–90%) with the cell population that was firing during the same period of the 4 → 4+1 DMS trial (pink curve). Such variability is a direct consequence of the randomness inherent to the network's dynamics. At the beginning of subdelay d2, when the network is instructed to perform the DPA task, activity in the task layer switches from the sustain to the recall unit. This abrupt modification in P cell priming creates a sudden reorganization of the cellular activity present in the layer: all previously firing layer P neurons are now primed into a quiet state by the silent sustain unit. Simultaneously, all cells primed by the now active recall task unit are free to fire. Those that do fire form the representation of the project to recall image 3 (light blue area).
Layer P cells have firing patterns which code for either memorizing or recalling an image. Each box contains three raster plots (top, center, and bottom), which represent the response of a single cell to a DMS trial where the cell's preferred image has to be sustained (top), a DPA trial where the cell's preferred image has to be sustained (center), and a DPA trial where the cell's preferred image has to be recalled (bottom).
Cells: For each class are listed the total number of cells in the class (black), the number of excitatory/inhibitory cells in the class (green/red), and the percentage of the total number of P neurons this class represents.
Role/priming: For each class, the table specifies whether neurons are involved in sustaining sample images, or recalling the targets. It also specifies the image specificity of the cells.
For type AP cells and “Others,” we only specify the number of cells primed by each unit.
After successful completion of DMS training, the network is able to recognize images, sustain their representation during a delay, and pick them during the choice period. To pass the DPA task, the network must in addition be able to recall the image associated with the sample (i.e., to create a recalled representation of the target image from a learned association).
As described earlier, the network learns to associate images in pairs by a process of trial and error. After the successful completion of DMS training, layers VR and WM act as a “winner-takes-all” network during the choice period: when presented with the target and distractor images, the network is constrained to select one or the other. Which one the model chooses depends on the activity currently harbored in working memory: whatever image is currently memorized in layer WM will be chosen by the model (see
In the particular run used to gather the data plotted in the figure, DMS training was completed at trial 48, and DPA training started at trial 49. For clarity, we only present on this figure the trials where images 2 and 1 are the sample and the target, respectively (activity of cell populations during the trials in between, which feature other images, is not represented). Trials failed by the network are represented in red. Successful trials are displayed in green.
(A) The areas represent the number of WM cells tuned to image 1 that fire at every instant of DPA trials where this image is the target (i.e., where image 1 has to be recalled from association). The figure shows the evolution of neural activity coding for the recalled representation of image 1. It first appears during the choice period of trial 55 as the network is presented with this image. At the next “2 → 1” trial (trial 58), this firing has become prospective, appearing at the start of subdelay d3 before image 1 has been presented to the network. The number of active cells varies between 0 and 120.
(B) The area represents the activity of the excitatory cell VR(1). As discussed above, VR cells act in the network as visual extension to the content of working memory layer WM. VR(1) activity therefore virtually mirrors that emerging in layer WM during training, as can be seen by comparing the buildup of prospective activity in (A) and (B).
The associations between images are coded in the network by the connections projected by layer P neurons onto layer WM cells. Taking the example of associating target image 1 with sample image 2 (illustrated in
Reward signal and reinforcement algorithm are crucial to selectively strengthening these connections, therefore securing the correct associations and discarding incorrect ones. Choosing the distractor image triggers the release of negative reward, which resets, through the reinforcement-learning algorithm, all connections projected by active P cells onto firing WM neurons (trial 49;
This evolution in the network connectivity leads to modifications in the activity of layer WM neurons, as a new recall component emerges to complement their previous firing pattern. Further,
Each curve illustrates the evolution during DPA training of the average number of WM cells firing during subdelay d3, and tuned to a particular image. In the run used to produce the data plotted on the figure, DPA training started at trial 81. The graph includes both successful and failed trials: the number of active cells increases when the network receives positive reward, and it decreases when negative reward is dispensed to it. Each curve roughly follows a sigmoid function, reaching a plateau in a limited number of successful trials.
The network dynamics during the task compares well with behavioral and electrophysiological data gathered on the monkey performing similar tasks.
Just like the monkey, the network solves the MDR task in a prospective manner: they both take advantage of the task-specific signal presented during the delay to predict and retrieve the upcoming target image before it is actually shown. This prospective strategy for passing the task was demonstrated for the animal by a detailed analysis of error patterns, reaction time, and electrophysiological data [
At the neural level, it can be seen that the firing pattern of VR layer neurons (
We now compare the activity of neurons in the WM and P layers (
Comparison between simulated (A) and experimental (B) units is restricted to the cue and delay periods (left of the dashed line on theoretical data), although the model makes predictions for the choice and response epochs as well. Notation: x → y + z = trial with sample x, target y, and distractor z. Other notations are as in
(A) Average firing patterns of cells of layer WM (neurons α and β) and layer P (neuron γ). For clarity, each graph represents the response of cells for only two trials (their firing for the other two trials being at background level). Task parameters and notations are as in
(B) Recordings performed in monkey dorsolateral PF cortex (modified from
The cell illustrated at the top of
The cell illustrated in the middle of
The last cell (
We tested the robustness of the model's dynamics at the neural and network structure levels by modifying parameters such as the membrane time constants, the parameter defining the overall scale of the cells' threshold, the misfire probability of cells, or the number of connections present within or between layers. In all cases we found that network performance varied either little, or very gradually, as a function of these changes.
We also studied at the systemic level the effect on the network of perturbing the control units' firing patterns. Indeed, for simplicity, the results described above were obtained using simple binary activities for all control units (see
To test the robustness of the network's dynamics, the firing patterns of control units were perturbed by adding to them white noise of varying strength A.
The top curve (
The next three curves are examples of activities perturbed with noises of amplitude
A similar analysis extends to disturbing the other control units of the model.
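The perturbation protocol can be reconstructed schematically: white noise of amplitude A is added to a unit's binary time course, and the unit's effective on/off state is recovered by thresholding, so large A can silence an active unit or trigger a silent one. The code below is our reconstruction under that assumption (the text describes the procedure only qualitatively), with a helper measuring how often the noise flips the unit's state.

```python
import numpy as np

rng = np.random.default_rng(1)

def perturbed_firing(binary_pattern, A, threshold=0.5):
    """Perturb a binary control-unit time course with white noise of
    amplitude A and re-threshold it to obtain the effective state."""
    noisy = binary_pattern + A * rng.standard_normal(len(binary_pattern))
    return (noisy > threshold).astype(int)

def flip_rate(pattern, A, n_runs=200):
    """Average fraction of time steps whose on/off state is flipped."""
    flips = [np.mean(perturbed_firing(pattern, A) != pattern)
             for _ in range(n_runs)]
    return float(np.mean(flips))
```

For small A the thresholded signal is virtually never flipped, consistent with the observation that weak perturbations leave the network's dynamics intact; flips only become frequent once A is comparable to the gap between the "on" level and the threshold.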
Each curve represents the success rate of the mature network for a given task when the firing pattern of one or all control units are perturbed by noise of amplitude
For most units and tasks, network performance varied little with increasing perturbation amplitude. In the other cases, however, we found a marked sigmoid-type decline in performance.
Simulations showed that perturbing a single or all control units produces little or no effect on the network's dynamics as long as
Increasing the value of
This resistance to local perturbations stems from the structure and dynamics of the network itself: the system is cooperative by construction, with distinct units interacting to perform the necessary neural computations. If one unit misfires, thereby failing to produce the activity required for the next step of the task, the results of previous neural computations will most likely still be present in the network until the perturbed unit does fire. The reset of layer WM during DPA trials is a good example: even if the reset WM unit repeatedly eliminates, during subdelay d3, the layer WM activity coding for the recalled target image, planning activity coding for the recall of this target is still safely stored in layer P and ready to be used during the choice period. In addition, the lateral inhibition present in the network is sufficient to minimize, or even control, the effect of any parasitic activity created by the spontaneous firing of control units. An example of such activity is the firing during DMS trials of layer WM and P cells tuned to the recall of the image associated with the sample. It arises when perturbations are applied to the firing pattern of the recall unit, and it can interfere with the behaviorally correct activity harbored in layers WM and P coding for the sample image. However, as shown on
The ability of layers WM and P to sustain activity gives considerable leeway to the system: to pass the DMS and DPA tasks, it must accomplish a series of neural computations in a precise order, synchronized with the periods of the task. However, because layers WM and P can sustain activity for up to several seconds, the precise instant at which these computations take place, or even whether they are repeated several times, has little influence on the network's performance. This explains the limited loss of performance when damaging most gating units.
We found the system's dynamics to be much more sensitive to perturbations during the training phase (not shown). In fact, the earlier perturbations take place, the more severe their effects on the dynamics.
This result is a direct consequence of the way in which circuitry emerges in the network. Indeed, as described above (see Learning Phase), network organization starts with the construction during the fixation task of neural clusters in layers WM and P, whose activity represents images and plans, respectively. As long as this crucial first phase is not totally completed, neural circuitry in layers WM and P is extremely labile: immature clusters can easily fuse together, depriving the network of the ability to treat different images or plans as distinct objects, and therefore ruining any chance of successfully completing the training phase. Such cluster collapse is certain to take place if information travels back and forth between pairs of layers (e.g., layers VR and WM, or WM and P). To keep this from happening, it is therefore essential that during this early part of training, gatings Gu and Gd (and Iu and Id) are never open at the same time [
This condition is no longer satisfied when large perturbations are added to the firing patterns of all control units, since these perturbations will generate frequent simultaneous openings of gatings Gu and Gd, or Iu and Id. This in turn results in the collapse of clusters in layers WM and P, and the complete breakdown of network performance on all tasks. In contrast, applying these perturbations later in training (i.e., once cluster integrity is already higher) has a smaller effect on network circuitry and task performance.
To our knowledge, two models, each using attractor neural network dynamics, have been proposed so far to study the neural mechanisms implementing memory recall in the framework of the DPA and DMS tasks [
These two models have in common with the present work the use of re-entrant connections to produce stable, image-specific activity representing the observed sample image and the expected target image. However, in our case, neurons are arranged in two-dimensional layers and the excitatory connections which link them are exclusively short range. Re-entrant neural circuits therefore organize themselves locally in the layer instead of spanning the whole network, as is the case in the above two models.
In all three models, the retrieval of the target image takes place as the system leaves the attractor corresponding to the activity coding for the sample image and moves toward that coding for its paired associate. However, the mechanisms through which this is achieved, and the way associations between images are built, differ sharply. While in the models described above [
Also, in the attractor neural network models above, associations are constructed through a slow process (either through Hebbian or more complex guided forms of learning) during the choice period of trials where neural activities coding for the sample and target images coexist. In that framework, the notion of reward, which is an intrinsic part of the protocol for the DMS and DPA tasks with live subjects, is absent from the dynamics of the learning or mature network. The network in the present work adopts the complementary view: reward is essential to DPA training, and is used by the network as a guide to learn the correct associations and discard incorrect ones. Indeed, during DPA trials, the network picks one of the two images presented to it during the choice period. This image is then the candidate that the network tentatively suggests to form a pair with the sample image presented before the delay. The reward signal dispensed to the network by the “experimenter” is then used by the model to figure out whether this choice is correct or incorrect.
It is interesting to compare the characteristics of these models with the known mechanisms for the retrieval of memories in primates that were studied using MDR-type tasks [
Behind the complex behavior exhibited by the model as it passes the DMS and DPA tasks lies an interesting dynamic that takes place in the network at both the cellular and circuit levels. Indeed, classification of WM neurons according to their firing patterns revealed that the distribution of cells among these classes follows a power law (see
Though simple compared to the central nervous system, the present model captures key biological features of several brain areas thought to take part in the neural processing necessary for the DMS and DPA tasks. We will examine each of them below.
The model can be trained on both the DMS and DPA tasks using two biologically realistic synaptic modification algorithms. The first is the Hebbian [
Moreover, the layers of the network implement functions typical of known brain areas or regions.
Layer VR is the point of convergence of both the bottom-up visual component coming from primary visual areas, and the top-down signal originating from working memory. As was shown when comparing simulated and experimental data, the activities of VR neurons fit those measured in the IT cortex during the cue and delay periods of both DMS and DPA trials [
Layer WM implements in the model several functions typically identified with PF circuitry: it is capable of sustaining activity coding for objects, whether they are perceived [
Both these interpretations are further supported by the qualitative agreement between the connectivity linking the PF and IT cortices and that linking layers VR and WM: both are dense, two-way, and subject to strong reward-type innervation [
The planning activity stored in layer P introduces a prospective component into the network's behavior. Planning results from the convergence of task-related signals and activity emanating from the image representation held in working memory. This characteristic firing pattern, which, as shown above, fits the observed data well, supports the hypothesis of a memory retrieval mechanism relying on planning circuits harbored by the dorsolateral cortex. The assumption that the planning circuitry required for the DMS and DPA tasks is located in the PF cortex agrees with experimental studies suggesting a role for the PF cortex in tasks involving high-level planning, such as the Tower of London task [
Units in layer T, which implement task-related higher processing, have a function similar to cells that were observed in monkey PF cortex and whose firings are specific to tasks [
The reset units of the model, which are dedicated to removing behaviorally irrelevant activity from the network, have a function close to the “inhibitory control” proposed by Fuster [
Gating units Gu, Gd, Iu, and Id provide the model with the ability to control the flow of information it harbors. Several experimental studies seem to indicate that such mechanisms exist in the brain. One example is the electrophysiological study by Miller et al [
In addition to these observations, mechanisms to shield information from competing neural activity or to manage the flow of information along neural pathways have already been proposed in the PF cortex [
Indeed, Sakai et al. [
Also, Graybiel and colleagues [
We suggest that the regulation of the flow of information between the PF and IT cortices illustrated by the present model could be implemented either by separate loops which have been reported as linking the basal ganglia with the temporal association area TE via the thalamus [
In order to pass the DMS and DPA tasks, live subjects must have an understanding of what they are required to do. In other words, they must hold in their mental space a neural activity that represents these tasks or, at the very least, codes for the different operations required to perform them. Animals are able to create this neural task representation in a few training sessions, using the reward dispensed by the experimenter as their only feedback from the exterior world on their actions.
The present model in contrast does not learn the DMS and DPA tasks: all the computational aspects of the tasks are already precoded in the firing patterns of the network's control units (see
One important question it addresses is how sensitive the behavior of a live subject is to the details of the neural activity that codes for it, or, in other words, whether a given task has only one unique neural representation. The present model suggests that this is not the case. Indeed, as we saw above, the network can function with an acceptable degree of success despite very extensive perturbations to the firing patterns of its control units. This clearly indicates that there is a large spectrum of firing patterns that the control units could adopt that would still lead to task success. This is partially due to the fact that, as mentioned earlier, after just a few trials, information sustains and protects itself in the network through reentrant excitatory connections and lateral inhibition. From then on, the exact timing of neural operations, or the number of times they are repeated, seems to be irrelevant as long as they occur in the right order and each within a certain time window. The firing patterns displayed in
The network is composed of leaky integrate-and-fire neurons represented by three dynamical variables [
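For readers unfamiliar with this neuron class, a textbook single-variable leaky integrate-and-fire update can be sketched as follows; the model's actual three-variable formulation is the one given in the cited reference, and all parameter names and values here are purely illustrative:

```python
def lif_step(v, i_in, dt=1.0, tau_m=20.0, v_rest=-70.0,
             v_thresh=-50.0, v_reset=-70.0):
    # One Euler step of a textbook leaky integrate-and-fire neuron:
    # the membrane potential decays toward rest while integrating
    # its input; crossing threshold emits a spike and resets it.
    v = v + (-(v - v_rest) + i_in) * (dt / tau_m)
    if v >= v_thresh:
        return v_reset, True   # spike emitted, potential reset
    return v, False

# Drive a neuron with a constant suprathreshold input for 200 ms
# (dt = 1 ms) and count the spikes it emits.
v, spikes = -70.0, 0
for _ in range(200):
    v, fired = lif_step(v, i_in=25.0)
    spikes += fired
```

With these illustrative parameters the steady-state potential (−45 mV) sits above threshold, so the neuron fires tonically at a rate set by the membrane time constant.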
Units in the network are interconnected into layers that are themselves connected together, for a grand total of more than 85,000 connections. Each of the neurons forming layers WM (900 neurons) and P (900 neurons) is interpreted as representing a single cell of the PF cortex. Units in layer VR (eight units) are meant to model whole arrays of IT cortex cells. Therefore, before comparing VR variables with experimental data, we need to reinterpret as a firing probability or rate of activity the difference
Each connection between two neurons in the network is represented by a synaptic strength
Gating units Gu, Gd, Iu, and Id are implemented in the model by binary variables that take the value 1 when the corresponding gating is open, and 0 when it is closed (
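The effect of such a binary gate on an inter-layer projection can be illustrated with a minimal sketch; the layer sizes, weights, and function name below are illustrative and not taken from the model:

```python
def gated_input(pre, weights, gate_open):
    # Weighted input delivered through a gated projection. The gate,
    # like units Gu, Gd, Iu, and Id, is a binary variable: when it is
    # 0 (closed) no activity flows between the layers; when it is
    # 1 (open) the full weighted input passes unchanged.
    if not gate_open:
        return [0.0] * len(weights)
    return [sum(w * x for w, x in zip(row, pre)) for row in weights]

# Toy example: activity from three presynaptic cells projecting
# onto two postsynaptic cells.
pre = [1.0, 0.0, 0.5]
w = [[0.2, 0.4, 0.6],
     [0.8, 0.1, 0.3]]
closed = gated_input(pre, w, gate_open=0)   # no information flows
opened = gated_input(pre, w, gate_open=1)   # full weighted input
```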
Each run starts by building the network anew: connections, neuron types (excitatory or inhibitory) in layers P and WM, and the task unit priming each P neuron are all drawn using a random number generator. The model is then subjected to fixation, DMS, and then DPA trials, with reward dispensed according to the rule of the task. We choose as success criterion for a task the completion of 20 correct trials in a row, in which the network "points" unequivocally to the target during the choice period. Since we did not model a motor cortex, the excitatory units of layer VR, which are unequivocally stimulus specific, are used to "read" the network's response. Once DMS training is successfully completed, all connections modified during this learning (i.e., connections exchanged by layers VR and WM, those exchanged by sustain-primed P cells and WM neurons, and those harbored within layer WM) are kept "frozen" during the subsequent DPA training. This precaution, not unlike the overlearning procedure used with animals, ensures that DMS knowledge is not damaged during the DPA learning phase. The synapses of projections exchanged between WM neurons and recall-primed P cells, which have not yet been recruited by the dynamics, are then modified to complete the circuitry necessary for DPA task performance. Complete runs are repeated a large number of times (typically 100) to measure the probability of success for the different tasks.
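The success criterion described above (20 correct trials in a row) can be expressed as a small helper; the function name and default are illustrative:

```python
def reached_criterion(outcomes, streak_needed=20):
    # True once a run of `streak_needed` consecutive correct trials
    # has occurred in the sequence of trial outcomes, the success
    # criterion used here for each task; any error resets the streak.
    streak = 0
    for correct in outcomes:
        streak = streak + 1 if correct else 0
        if streak >= streak_needed:
            return True
    return False
```

For example, 19 correct trials followed by one error do not satisfy the criterion, however many times that pattern repeats, whereas any uninterrupted run of 20 correct trials does.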
(8.3 MB MOV)
(8.1 MB MOV)
We thank J.-P. Changeux for his support and for many stimulating discussions. We also thank R. Klink and M. Zoli for discussions and help with the manuscript. TG gratefully acknowledges the support of V. Gisiger and R.-M. Gisiger throughout this project.
delayed-matching to sample
delayed-pair association
inferior temporal
mixed-delayed task
prefrontal