
A Common Cortical Circuit Mechanism for Perceptual Categorical Discrimination and Veridical Judgment

  • Feng Liu,

    Affiliation Department of Physics, Nanjing University, Nanjing, People's Republic of China

  • Xiao-Jing Wang

    xjwang@yale.edu

    Affiliation Department of Neurobiology and Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, Connecticut, United States of America

Abstract

Perception involves two types of decisions about the sensory world: identification of stimulus features as analog quantities, or discrimination of the same stimulus features among a set of discrete alternatives. Veridical judgment and categorical discrimination have traditionally been conceptualized as two distinct computational problems. Here, we found that these two types of decision making can be subserved by a shared cortical circuit mechanism. We used a continuous recurrent network model to simulate two monkey experiments in which subjects were required to make either a two-alternative forced choice or a veridical judgment about the direction of random-dot motion. The model network is endowed with a continuum of bell-shaped population activity patterns, each representing a possible motion direction. Slow recurrent excitation underlies accumulation of sensory evidence, and its interplay with strong recurrent inhibition leads to decision behaviors. The model reproduced the monkey's performance as well as single-neuron activity in the categorical discrimination task. Furthermore, we examined how direction identification is determined by a combination of sensory stimulation and microstimulation. Using a population-vector measure, we found that direction judgments instantiate winner-take-all (with the population vector coinciding with either the coherent motion direction or the electrically elicited motion direction) when two stimuli are far apart, or vector averaging (with the population vector falling between the two directions) when two stimuli are close to each other. Interestingly, for a broad range of intermediate angular distances between the two stimuli, the network displays a mixed strategy in the sense that direction estimates are stochastically produced by winner-take-all on some trials and by vector averaging on the other trials, a model prediction that is experimentally testable. 
This work thus lends support to a common neurodynamic framework for both veridical judgment and categorical discrimination in perceptual decision making.

Author Summary

In daily life, we constantly face two types of perceptual decisions: to identify an object feature (what is the speed of that car?) or to discriminate the same feature among two or more possible categories (is that car going faster than the speed limit?). These decision processes appear to involve very different computations: while identification relies on an analog judgment, categorical discrimination is based on a comparison of the object feature with discrete options. Do they engage entirely separate brain mechanisms? In this work, we showed that these two types of decision making can be instantiated by a single cortical circuit. We used a continuous recurrent network model to simulate two monkey experiments in which subjects were required to make either a two-alternative choice or a veridical judgment about the direction of random-dot motion. The model reproduced salient experimental observations and makes testable predictions. The results demonstrate that a common cortical circuit can perform both categorical discrimination and veridical judgment. Conceptually, this work supports the notion that a cortical circuit endowed with reverberatory dynamics can fulfill multiple cognitive functions such as working memory and decision making.

Introduction

Perceptual judgments involve detection, identification and discrimination of objects in a sensory scene [1],[2]. Given an ambiguous visual motion pattern, for instance, a subject may be asked to detect whether a net motion direction is present or absent [3], to identify the motion direction as an analog quantity [4], or to discriminate the motion direction between two options (e.g., left or right) [5]. Using the strategy of single-unit recording from behaving monkeys, neurophysiologists have begun to uncover neuronal activity that is linked to such perceptual judgments (for reviews, see [6]–[11]). In monkey experiments using perceptual discrimination tasks, neural correlates of decision making have been observed in the parietal [12],[13], premotor [14]–[16] and prefrontal [17],[18] cortical areas. Experimental observations have inspired the advance of neural circuit models which suggest that recurrent (attractor) network dynamics can underlie temporal integration of sensory information (accumulation of evidence) and decision formation [18]–[25].

Focusing on categorical discrimination, those neural circuit models as well as abstract ramp-to-threshold models [26]–[30] are typically endowed with a simple architecture consisting of discrete neural pools, selective for categorical alternatives. Therefore, they are inadequate for exploring perceptual identification that requires neural representation of analog quantities, such as motion direction that can be an arbitrary angle between 0° and 360°. On the other hand, probabilistic estimation of an analog stimulus feature has been studied from the perspective of optimal population coding [2],[31],[32]. These studies centered on optimal algorithms for reading out a stimulus feature from sensory neural populations, such as inferring the orientation of a visual stimulus from neural activity in the primary visual cortex [33] and the direction of a motion stimulus from activity profiles across the middle temporal visual area (MT) [2]. However, such probabilistic inference is believed to occur in higher-order cortical areas downstream from primary sensory areas, and the underlying circuit mechanism remains unclear. In particular, it is unknown whether probabilistic estimation and categorical discrimination engage distinct decision processes or can be realized by a shared neural circuit mechanism.

In the present work, we investigated this outstanding question using a continuous recurrent network model of spiking neurons, which was initially proposed for spatial working memory [34]. We applied this model to the simulation of two monkey experiments using random-dot visual motion stimuli. In a two-alternative forced-choice direction discrimination task (Figure 1A), the monkey was trained to discriminate the motion direction by making a saccadic eye movement to one of two peripheral choice targets [12],[13],[35]. It was found that ramp-like spiking activity of neurons in the lateral intraparietal cortex (LIP) is correlated with the monkey's choice. By contrast, in a direction identification task (Figure 1B), the monkey was required to veridically report its perceived direction of motion in the visual stimulus [4]. On some trials, electrical stimulation was applied simultaneously to MT neurons while the monkey viewed the random-dot display. Microstimulation could bias the monkey's judgments toward the preferred direction of MT neurons at the microstimulation site [4],[36]. It was argued that both vector-averaging and winner-take-all algorithms might contribute to the interpretation of activity profiles of MT neurons. However, the study in [4] collected only behavioral data and did not record neural activity in MT or downstream cortical areas. Thus, the neural mechanism for veridical judgments about the motion direction remains unknown.

Figure 1. Schematic depiction of two monkey experiments that were simulated by the continuous recurrent network model.

(A) Reaction-time version of a two-alternative forced-choice direction discrimination task. A trial began when the monkey fixated a point on the display monitor. Two choice targets then appeared in the periphery. One was within the response field (RF) of the recorded neuron, and the other was in the opposite hemisphere. After a delay, a dynamic random-dot display appeared, where a fraction of dots moved coherently toward one of the two targets while the others moved randomly in all other directions. The monkey was allowed to make a saccadic eye movement toward a target at any time when it was ready. (B) Direction identification task. After fixation, a random-dot motion stimulus appeared inside a target ring and lasted 1 s. When the fixation point was extinguished, the monkey made a saccadic eye movement to the location on the target ring toward which the dots had moved. On some trials, electrical stimulation was simultaneously applied to MT neurons.

https://doi.org/10.1371/journal.pcbi.1000253.g001

Here we show that the continuous recurrent network model is capable of reproducing salient observations from both experiments. Our results suggest that both categorical discrimination and veridical judgment can be subserved by a common cortical circuit endowed with reverberatory dynamics.

Materials and Methods

Network Architecture

Our model is designed to simulate two perceptual decision tasks in which the decision is about the net direction of a random-dot motion stimulus. Since the directional angle is a one-dimensional quantity, we used a continuous network model in which each neuron is selective for a motion direction, from 0° to 360°. Our model network does not directly map onto LIP, in which neurons have response fields in a two-dimensional visual space. However, our model is adequate for simulating the two tasks, and we do not anticipate that a two-dimensional version of our model would behave in qualitatively different ways.

The model network is composed of NE pyramidal cells and NI interneurons. The network architecture is consistent with a columnar organization [34],[37]. Cells are spatially distributed on a ring according to the motion direction to which they are most sensitive (Figure 2A). Each neuron is labeled by its preferred direction θi, which is uniformly distributed between 0° and 360°. Simulations were done with NE = 2048 and NI = 512.

Figure 2. Network architecture and input signals.

(A) Schematic illustration of network structure. The network is composed of 2,048 pyramidal cells and 512 interneurons. Excitatory cells are labeled and arranged by their preferred motion directions (from 0° to 360°). The connectivity between pyramidal cells is structured, and the synaptic strength is a Gaussian function of the difference between their preferred directions (solid curve). Connections to or from inhibitory interneurons are broad. (B) Spatial profile and time course of input rates in the direction discrimination task. External inputs to the network from two targets and the motion stimulus are separately modeled as excitatory synaptic currents mediated by AMPA receptors, with presynaptic spikes emitted based on Poisson processes. Poisson rates are depicted in the figure as a function of preferred directions of neurons and time: the maximum input rate from two targets, the input rate from the motion stimulus for four different motion strengths, and their corresponding time courses, respectively (from top to bottom). For the target input, the effects of spike-rate adaptation and divided attention upon stimulus onset are included. (C) Spatial profile of input rate in the direction identification task. The inputs from both the motion stimulus and microstimulation are modeled as excitatory synaptic currents. The profiles of Poisson rate are shown for four different stimulus directions with the microstimulated direction fixed at 90°.

https://doi.org/10.1371/journal.pcbi.1000253.g002

Neurons and Synapses

Both pyramidal cells and interneurons are described by leaky integrate-and-fire neurons and are characterized by six parameters [34]: the membrane capacitance Cm, the leak conductance gL, the resting potential EL, the threshold potential Vth, the reset potential Vreset, and the refractory time τref. The values used were: Cm = 0.5 nF, gL = 25 nS, EL = −70 mV, Vth = −50 mV, Vreset = −59 mV, and τref = 2 ms for pyramidal cells; Cm = 0.2 nF, gL = 20 nS, EL = −70 mV, Vth = −50 mV, Vreset = −59 mV, and τref = 1 ms for interneurons. Below Vth, the membrane potential Vi(t) of cell i obeys the following equation:

$$C_m \frac{dV_i(t)}{dt} = -g_L\left[V_i(t) - E_L\right] - I_{i,\mathrm{syn}}(t),$$

where Ii,syn represents the total synaptic current flowing into the cell.
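The subthreshold dynamics and the threshold, reset and refractory rules can be illustrated with a minimal Python sketch. It assumes the standard leaky integrate-and-fire form and replaces the synaptic input with a hypothetical constant depolarizing current `i_inj_na`, so the current enters with a positive sign here. The function name and the forward-Euler scheme are choices made for this example only (the paper reports a second-order Runge-Kutta method). The default parameters are the pyramidal-cell values quoted above, in units of nF, µS, mV, nA and ms.

```python
def simulate_lif(i_inj_na, t_stop_ms=500.0, dt_ms=0.02,
                 c_m=0.5, g_l=0.025, e_l=-70.0,
                 v_th=-50.0, v_reset=-59.0, t_ref_ms=2.0):
    """Return spike times (ms) of one LIF cell driven by a constant current.

    Units: c_m in nF, g_l in uS, voltages in mV, current in nA, time in ms,
    so that (uS * mV) and (nF * mV / ms) are both in nA.
    """
    v = e_l                 # start at the resting potential
    refractory_left = 0.0   # remaining refractory time in ms
    spike_times = []
    n_steps = int(round(t_stop_ms / dt_ms))
    for step in range(n_steps):
        t = step * dt_ms
        if refractory_left > 0.0:
            # Membrane potential is held at reset during the refractory period.
            refractory_left -= dt_ms
            continue
        # Forward-Euler step of C_m dV/dt = -g_L (V - E_L) + I_inj.
        dv_dt = (-g_l * (v - e_l) + i_inj_na) / c_m
        v += dt_ms * dv_dt
        if v >= v_th:
            spike_times.append(t)
            v = v_reset
            refractory_left = t_ref_ms
    return spike_times
```

With these parameters the membrane time constant is Cm/gL = 20 ms and the current needed to reach threshold is gL(Vth − EL) = 0.5 nA, so a 0.4 nA input stays subthreshold while a 0.6 nA input produces regular firing.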

The network is endowed with pyramidal-to-pyramidal, pyramidal-to-interneuron, interneuron-to-pyramidal, and interneuron-to-interneuron connections (Figure 2A). For the sake of simplicity, only the connectivity between pyramidal cells is structured. Recurrent excitatory currents are mediated by AMPA receptors (AMPARs) and NMDA receptors (NMDARs), while inhibitory currents are mediated by GABAA receptors (GABAA Rs). External excitatory inputs include those from MT neurons, which represent visual motion stimuli and electrically elicited directional signals. When simulating the categorical discrimination task, additional inputs represent the presentation of choice targets. All neurons also receive background synaptic input mimicking spontaneous activity outside the local network. In simulations, all these external currents are mediated exclusively by AMPARs.

The total synaptic current in pyramidal cell i is given bywherewith VE = 0 mV and VI = −70 mV. Ii,back represents background synaptic input. Ii,AMPA and Ii,NMDA denote recurrent excitatory inputs, while Ii,GABA represents recurrent inhibitory input. The maximum synaptic conductances are denoted by (pyramidal-to-pyramidal), and (interneuron-to-pyramidal), respectively. We shall describe Ii,ext in the following sections.

For interneuron i, the total synaptic current is described similarly except for Ii,ext = 0 as well as different synaptic conductances (pyramidal-to-interneuron), and (interneuron-to-interneuron).

The synaptic strength between two pyramidal cells i and j depends on the difference between their preferred directions and is described as or with . If θ>180°, it is set to θ−360, and if θ<−180°, it is set to θ+360. This is done to satisfy the periodic boundary condition, which is also imposed on the following Equations 2–5. Note that W(θ) is normalized asW(θ) with and σw = 18° is shown in Figure 2A (solid curve).
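The wrap-around rule for angular differences can be sketched in Python as follows. The sketch assumes that preferred directions are evenly spaced as θi = 360°·i/NE. The helper names are hypothetical, and the Gaussian factor shows only the shape of the tuned part of the connectivity (σw = 18°), not the full normalized footprint W(θ).

```python
import math

N_E = 2048           # number of pyramidal cells (from the text)
SIGMA_W_DEG = 18.0   # footprint width quoted for Figure 2A


def preferred_direction_deg(i, n_cells=N_E):
    """Preferred direction of cell i, assuming even spacing on the ring."""
    return 360.0 * i / n_cells


def wrap_angle_deg(theta):
    """Map a difference of two directions (in degrees) into [-180, 180]."""
    if theta > 180.0:
        return theta - 360.0
    if theta < -180.0:
        return theta + 360.0
    return theta


def gaussian_factor(theta_i, theta_j, sigma=SIGMA_W_DEG):
    """Unnormalized Gaussian of the wrapped difference (illustrative only)."""
    d = wrap_angle_deg(theta_i - theta_j)
    return math.exp(-d * d / (2.0 * sigma * sigma))
```

With this rule, cells preferring 350° and 10° are treated as 20° apart rather than 340° apart, so they are coupled as strongly as cells preferring 10° and 30°.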

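The gating variables, i.e., the fractions of open channels, are described as follows. The AMPA (external and recurrent) synaptic variable obeys the following equation:

$$\frac{ds_j^{\mathrm{AMPA}}(t)}{dt} = -\frac{s_j^{\mathrm{AMPA}}(t)}{\tau_{\mathrm{AMPA}}} + \sum_k \delta\left(t - t_j^k\right), \tag{1}$$

where the decay time constant was set to τAMPA = 2 ms, and the sum over k represents a sum over spikes emitted by presynaptic neuron j [19]. In the case of background noise, the gating variable also obeys Equation 1, where spikes are emitted based on a Poisson process with a rate of 1.5 kHz independently from cell to cell. The maximum conductances were set to and . NMDA currents have a voltage dependence that is controlled by the extracellular magnesium concentration, [Mg2+] = 1 mM. Thus, the NMDA channel kinetics are modeled as

$$\frac{ds_j^{\mathrm{NMDA}}(t)}{dt} = -\frac{s_j^{\mathrm{NMDA}}(t)}{\tau_{\mathrm{NMDA,decay}}} + \alpha\, x_j(t)\left[1 - s_j^{\mathrm{NMDA}}(t)\right],$$

$$\frac{dx_j(t)}{dt} = -\frac{x_j(t)}{\tau_{\mathrm{NMDA,rise}}} + \sum_k \delta\left(t - t_j^k\right),$$

with τNMDA,decay = 100 ms, α = 0.5 ms−1, and τNMDA,rise = 2 ms [19]. The GABA synaptic variable obeys the following equation:

$$\frac{ds_j^{\mathrm{GABA}}(t)}{dt} = -\frac{s_j^{\mathrm{GABA}}(t)}{\tau_{\mathrm{GABA}}} + \sum_k \delta\left(t - t_j^k\right),$$

with τGABA = 10 ms. All synapses have a latency of 0.6 ms.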
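To illustrate the difference between fast AMPA and slow NMDA kinetics, the following Python sketch integrates the gating variables of a single synapse. It assumes the commonly used forms ds/dt = −s/τAMPA + Σδ(t − tk) for AMPA and, for NMDA, dx/dt = −x/τrise + Σδ(t − tk) together with ds/dt = −s/τdecay + αx(1 − s). The function name `run_gating`, the forward-Euler scheme and the omission of the 0.6 ms latency are simplifications made for this example and are not taken from the paper.

```python
TAU_AMPA_MS = 2.0          # AMPA decay time constant
TAU_NMDA_DECAY_MS = 100.0  # NMDA decay time constant
TAU_NMDA_RISE_MS = 2.0     # NMDA rise time constant
ALPHA_PER_MS = 0.5         # NMDA saturation rate


def run_gating(spike_times_ms, t_stop_ms, dt_ms=0.02):
    """Return a list of (t, s_ampa, s_nmda) for one presynaptic spike train."""
    spike_steps = {int(round(t / dt_ms)) for t in spike_times_ms}
    s_ampa = 0.0
    s_nmda = 0.0
    x = 0.0  # auxiliary variable driving the NMDA rise
    trace = []
    for step in range(int(round(t_stop_ms / dt_ms))):
        if step in spike_steps:
            # Each presynaptic spike is a delta-function kick.
            s_ampa += 1.0
            x += 1.0
        # Forward-Euler update; s_nmda uses x from before its own decay step.
        s_ampa += dt_ms * (-s_ampa / TAU_AMPA_MS)
        s_nmda += dt_ms * (-s_nmda / TAU_NMDA_DECAY_MS
                           + ALPHA_PER_MS * x * (1.0 - s_nmda))
        x += dt_ms * (-x / TAU_NMDA_RISE_MS)
        trace.append((step * dt_ms, s_ampa, s_nmda))
    return trace
```

After a single spike, the AMPA variable has decayed to roughly 1/e within 2 ms, whereas the NMDA variable is still substantial 50 ms later. This slow decay is what allows recurrent excitation to accumulate input over hundreds of milliseconds.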

In simulations of the discrimination task, the maximum recurrent synaptic conductances (in µS) were taken as , , , , , and . These conductances are scaled inversely proportionally to the number of pyramidal cells and of interneurons, respectively. This is to keep the total synaptic conductances unchanged when network size is varied. With these parameter values, NMDAR channels contribute 85% to recurrent excitatory charge entry at a holding potential of −65 mV. To simulate the identification task, we decreased the conductance values except . Meanwhile, we increased the ratio of to and of to so that the overall recurrent inhibition is decreased. The following values were used: , , , , , as well as and σw = 14°. In this case, NMDAR channels contribute 83.5% to recurrent excitatory charge entry at a holding potential of −65 mV. Three features are worth noting. First, recurrent excitation is taken to be primarily mediated by NMDARs [38]. Second, the network is dominated by recurrent inhibition [34]. Third, neurons receive a large amount of background noise.

Two-Alternative Direction Discrimination Task

To simulate a two-alternative direction discrimination task [13],[35], the presentation of two choice targets at θ1 and θ2 is modeled through selective synaptic input to the pyramidal cells whose preferred directions are close to either θ1 or θ2. The random-dot motion stimulus is represented by MT neurons, which project to LIP. Therefore, the external input to pyramidal cell i is assumed to be Ii,ext(t) = Ii,tar(t)+Ii,stim(t) with and obey Equation 1, with spikes discharged according to Poisson processes with rates and , respectively.

depends on the preferred direction θi of each cell and varies with time; it is described aswith(2)where t0 and t1 represent the onset times for the targets and the stimulus, respectively. The function h(t) models the spike-rate adaptation of upstream neurons encoding the targets and the presumed divided attention upon stimulus onset. The adaptation time constant τad was set to 80 ms. Upon the stimulus onset, the strength of target input is assumed to be reduced, presumably resulting from a cross inhibition between upstream neurons separately signaling the motion stimulus and the targets, or because the subject's covert attention is shifted from the targets to the stimulus. Consequently, the neural activity decreases momentarily, resembling a brief ‘dip-and-rise’ in firing rate of LIP neurons. We used the following values: θ1 = 90°, θ2 = 270°, σtar = 13°, t0 = 500 ms, t1 = 1300 ms, and gtar = 12 nS (Figure 2B). The specific parameter values in R(θi) and h(t) are not so important, provided that the input from the targets is sufficiently strong to trigger high neural activity before stimulus presentation.

Based on the tuning curves of MT neurons during the presentation of a random-dot display [39], is modeled as(3)with c′ (0≤c′≤1) denoting motion strength and θ1 the direction of coherent motion. We used the following values: r0 = 100 Hz, r1 = 30 Hz, r2 = 90 Hz, σstim = 40°, and gstim = 5.9 nS (Figure 2B). Note that there is a latency for visual signals to arrive in LIP, which was assumed to be 200 ms [29],[35].

Direction Identification Task

The simulations used the same protocol as in [4]. Pyramidal cells in the model circuit receive excitatory synaptic input from MT neurons representing both the motion stimulus and the electrically evoked directional signal. MT activity is broadly tuned to visual motion stimuli, characterized by tuning curves with a typical width at half-height of ∼90° [39][42]. On the other hand, we assume that microstimulation activates a much narrower range of MT neurons and also evokes lateral inhibition from interneurons. As a result, the external input is described aswhere obeys Equation 1, with spikes emitted based on a Poisson process with a rate μi. In the presence of only the visual stimulus,(4)

In the presence of microstimulation alone,(5)

As a first-order approximation, μi = μs(θi)+μm(θi) in the presence of both the visual stimulus and microstimulation, which are delivered simultaneously and last a fixed duration of 1 s. Equation 4 is similar to Equation 3. The second term on the right-hand side of Equation 5 mimics lateral inhibition from interneurons; the third term ensures that μm remains positive.

The directional angles θ1 and θ2 denote the coherent motion direction in the random-dot display and the preferred direction of MT neurons at the microstimulation site, respectively. We assume A0 = 7−3.5c′ and A1 = 49c′ (in units of Hz) with c′ being the stimulus coherence level. As in the experiment, c′ was always set to 80% representing a vivid suprathreshold stimulus unless specified otherwise. This is so because the experimental study aimed to investigate the interaction between this suprathreshold motion stimulus and microstimulation at varying angular distances. Other parameter values were chosen so that the maximum firing rate of cells at stimulus offset is comparable when microstimulation or the visual stimulus is presented alone. The values used were: A2 = 86.8 Hz, α = 0.25, β = 0.05, σ1 = 21°, σ2 = 33°, σstim = 40°, θ2 = 90°, and gstim = 6.1 nS. θ1 varied with trials.

The angular difference Δθ = |θ2 − θ1| can be used to classify neural activity. For a small Δθ, there is a significant overlap between the two inputs, μs and μm, and the input remains relatively large between the two peaks (Figure 2C). For a large Δθ, the two inputs are nearly independent of each other.

Readout of the Direction Judgment

For both the direction discrimination and identification tasks, we used the same measure to read out the direction judgment. It is determined by a population vector scheme as follows [43]:

$$\theta_{PV} = \arctan\left(\frac{\sum_i r_i \sin\theta_i}{\sum_i r_i \cos\theta_i}\right),$$

where ri is the instantaneous firing rate of cell i, whose preferred direction is θi. In particular, the value of θPV at stimulus offset is denoted by θE, which represents the direction estimate on an individual trial. ri is calculated as follows: for each 40-ms time window (advanced in 5-ms steps), the total number of spikes is counted and divided by the window length.
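A population-vector readout of this kind can be sketched in Python as below. The sketch assumes the usual definition, in which θPV is the angle of the rate-weighted sum of unit vectors pointing along each cell's preferred direction. It uses `atan2` so that the result falls in the correct quadrant. The function names and the windowed-rate helper are hypothetical illustrations, not the authors' code.

```python
import math


def population_vector_deg(rates_hz, preferred_dirs_deg):
    """Angle (degrees, 0 to 360) of the rate-weighted sum of unit vectors."""
    x = sum(r * math.cos(math.radians(th))
            for r, th in zip(rates_hz, preferred_dirs_deg))
    y = sum(r * math.sin(math.radians(th))
            for r, th in zip(rates_hz, preferred_dirs_deg))
    return math.degrees(math.atan2(y, x)) % 360.0


def windowed_rate_hz(spike_times_ms, t_end_ms, window_ms=40.0):
    """Firing rate from the spike count in the window ending at t_end_ms."""
    count = sum(1 for t in spike_times_ms
                if t_end_ms - window_ms < t <= t_end_ms)
    return count / (window_ms / 1000.0)
```

Two equally active groups of cells preferring 80° and 100° yield a population vector at 90°, which is the vector-averaging outcome. If only one group remains active, the vector coincides with that group's preferred direction, which is the winner-take-all outcome.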

For the reaction-time version of the discrimination task, we also read out decision time based on threshold crossing of neural population firing rates. Specifically, we calculated the instantaneous population firing rates, r1 and r2, of two neural pools separately centered at θ1 and θ2, each consisting of 140 cells and spanning 360°×(140/2048)∼24°. That is, each pool consists of cells with their preferred directions within ∼±12° around θ1 or θ2. The time bin was 40 ms, and a sliding window of 5 ms was used to smooth data. Decision time is calculated by assuming that a decision is made whenever r1 or r2 first reaches a prescribed threshold, which was set to 57 Hz to fit behavioral data. Decision times can be compared with experimentally recorded reaction times by adding a non-decision response time ∼70 ms (i.e., the additional time it takes for a monkey to generate a saccadic eye movement after a choice is made).
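The threshold-crossing readout can be illustrated with a short Python sketch. It assumes that the two pool rates have already been computed on a common time grid measured from stimulus onset. The function names are hypothetical, and the tie-breaking rule when both pools cross in the same time step is an arbitrary choice for this example.

```python
THRESHOLD_HZ = 57.0      # decision threshold quoted in the text
NON_DECISION_MS = 70.0   # approximate non-decision time quoted in the text


def pool_span_deg(pool_size=140, n_cells=2048):
    """Angular span covered by a pool of adjacent cells on the ring."""
    return 360.0 * pool_size / n_cells


def decision_from_rates(times_ms, r1_hz, r2_hz, threshold_hz=THRESHOLD_HZ):
    """Return (decision_time_ms, choice) at the first threshold crossing.

    choice is 1 or 2 for the pool that reached threshold; None is returned
    if neither pool crosses the threshold within the trace.
    """
    for t, a, b in zip(times_ms, r1_hz, r2_hz):
        if a >= threshold_hz or b >= threshold_hz:
            return t, (1 if a >= b else 2)
    return None


def reaction_time_ms(decision_time_ms, non_decision_ms=NON_DECISION_MS):
    """Decision time plus the non-decision component."""
    return decision_time_ms + non_decision_ms
```

For example, `pool_span_deg()` gives about 24.6°, in line with the roughly ±12° pool half-width stated above.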

Numerical Method

The trial-averaged population firing rates were obtained by averaging over 1000 correct trials (Figure 3C). Moreover, to visualize network activity, spatiotemporal maps of firing rate are shown in Figure 3B. A spike time rastergram for all pyramidal cells was smoothed with a sliding window both in time (50 ms) and along the neural population (10 neurons). The resulting firing rate was color coded. The integration method used is a modified second-order Runge-Kutta algorithm [44], with a time step of 0.02 ms.

Figure 3. Network activity during the direction discrimination task.

(A) (Top) Spatiotemporal firing pattern of pyramidal cells with the stimulus at zero coherence. x-Axis, time; y-axis, cells labeled by their preferred directions. Two targets are separately presented at 90° and 270° (indicated by arrows). The targets and the motion stimulus are presented at 500 ms and 1,300 ms, respectively. But there is a latency (about 200 ms) for the visual signal to reach LIP. (Bottom) Time course of the population firing rates for the two neural pools, each consisting of 140 neurons and separately centered at 90° (r1, black) and 270° (r2, red), and for the neurons whose preferred directions are at least 26° away from 90° and 270° (blue), respectively. (B) Network activity patterns shown with a color-coded firing rate map for three coherence levels. The coherent motion direction is 90° (indicated by triangles). (C) Time course of population firing rates r1 (solid curves) and r2 (dashed curves), averaged over 1,000 correct trials, for various coherence levels. See Results for detailed description.

https://doi.org/10.1371/journal.pcbi.1000253.g003

Results

We will first report model simulations of the categorical discrimination task [13] and assess how well the model reproduces the monkey's performance as well as LIP activity that appears to reflect the decision computation. We will then use the same model to simulate the direction identification task involving the microstimulation of MT [4]. We will examine how a continuous recurrent circuit, endowed with strong reverberatory dynamics, can integrate sensory information and make categorical choices in the discrimination task or instantiate both the winner-take-all and vector-averaging mechanisms for direction judgments in the identification task.

Two-Alternative Forced-Choice Direction Discrimination Task

Graded Ramping Neural Activity and Categorical Competition

Model simulations used the same protocol as in the reaction-time version of a two-alternative direction discrimination task [13]. Figure 3A displays typical network activity in response to both the two targets and a random-dot motion stimulus at zero coherence. The network activity is monitored by plotting its spatiotemporal firing pattern (upper panel). A trial begins with the network in a resting state in which cells exhibit low spontaneous firing. Two targets are then separately presented at θ1 (90°) and θ2 (270°), presenting the network with two choice options. In response, two neural pools separately centered around θ1 and θ2 show persistent elevated activity, with largely asynchronous neural discharges. Thus, the profile of network activity exhibits two symmetric ‘bumps’ separately centered at θ1 and θ2. That is, there is no winner-take-all competition in the symmetric state. This has also been observed in [23] and can be understood as follows. In our model, recurrent excitation is dominated by the NMDAR-mediated current, which saturates at high firing rates [38]. The winner-take-all mechanism requires not only global inhibition but also recruitment of synaptic excitation. This recurrent excitation saturates at (symmetric) high firing rates, and thus no winner-take-all occurs.

Upon onset of the motion stimulus, neural activity decreases transiently owing to a reduced efficacy of target input (see Methods). The biological origin of this reduction is currently unknown; possible scenarios include cross inhibition between upstream neurons separately signaling the targets and the motion stimulus, and a shift of the subject's covert attention from the targets to the stimulus. After the visual signal reaches the decision circuit (with a latency of 200 ms), the two neural pools integrate the signal and compete against each other through shared inhibitory feedback from interneurons. Eventually, one neural pool wins the competition and increases its activity, while the other's activity is greatly suppressed, leading to a categorical choice. Note that winner-take-all competition occurs even when the stimulus input is uniform across the network. This is interpreted as follows. The symmetric state with high firing rates is stable only for sufficiently strong inputs. It disappears and is replaced by asymmetric states (with one of the two bumps growing while the other shrinks) when the target input is reduced to lower levels after stimulus onset, similar to the behavior of a model network composed of discrete neural pools [23].

The decision process can be revealed by showing the time course of population firing rates, r1 and r2, of the two neural pools separately centered around θ1 and θ2 (see Methods). In response to target presentation, r1 and r2 initially display a drastic increase followed by an adaptation to ∼40 Hz (Figure 3A, lower panel), resembling the LIP response to target presentation [12],[13],[35]. After the motion stimulus is delivered, both r1 and r2 first decrease and then rise together to nearly the same level as before stimulus onset. Such a dip-and-rise has been widely observed in experiments [13],[35],[45],[46]. Afterwards, r1 and r2 begin to diverge over time, with r2 climbing while r1 decays in this example. This subserves the formation of a binary decision. A choice is made when r2 reaches a prescribed threshold. Throughout the decision process, there is a dynamic balance between recurrent excitation and inhibition, as the activity of interneurons builds up in parallel with that of winning pyramidal cells (data not shown). This excitation-inhibition balance is important for ensuring network stability and, together with background synaptic noise, contributes to stochastic network dynamics. Given the stimulus at zero coherence, this stochasticity determines the choice outcome on any given trial, and thus the decision is at chance level across trials.

Figure 3A also displays the time course of the mean firing rate of the pyramidal cells which are not activated directly by the two target inputs (blue curve). After the presentation of two targets, since the two activated neural pools (in the “bumps”) excite interneurons, which in turn send feedback inhibition globally to the entire network, those pyramidal cells show a suppressed activity compared to the spontaneous state. After the visual stimulus reaches the decision network, those cells also receive an extra external activation (e.g., the motion stimulus is uniform at zero coherence). Meanwhile, the feedback inhibition decreases because of the drop of neural activity in one of the two bumps. These two factors combined lead to the increase of firing activity of those cells.

In the monkey experiment, coherence level or motion strength c′ refers to the fraction of dots that move coherently in one particular direction (e.g., 90°) while the others move randomly in all other directions with a uniform distribution in the random-dot display. This is implemented in the model as bell-shaped input profiles (see Figure 2B), which mimic the activity profiles of MT neurons at different coherence levels [39]. Figure 3B shows the network activity on single trials with stimuli at nonzero coherence levels. After two targets are presented, two bumps separately develop around θ1 and θ2. Since the targets remain present throughout the trial, they ‘instruct’ the network about the two choice options and always exert an influence on the decision process. After stimulus onset, neural activity first decreases briefly and then rises. Furthermore, there is a transition from the symmetric state to the asymmetric state, where one bump eventually becomes predominant over the other. This transition occurs faster with increasing coherence level.

Figure 3C displays the time course of population firing rates r1 and r2, averaged over correct trials, for different c′ values. Immediately after stimulus onset, there is a dip-and-rise in population activity, which is independent of motion strength, similar to the observation from LIP neurons [13],[35]. About 200 ms after stimulus onset, r1 and r2 begin to diverge and vary in a ramp-like pattern, which underlies the network's temporal integration of sensory inputs. The ramping activity is faster with a larger slope at higher c′. Moreover, at lower c′, immediately after the dip-and-rise, the firing rate of the winning pool shows a momentary plateau for ∼100 ms before it ramps up (see red, green and blue solid curves). This biphasic behavior (i.e., plateau-and-ramp) has been observed in LIP activity [30],[35] and in our previous model [21]. Therefore, the graded ramping activity reflects the quality of sensory evidence, and the ultimate divergence in spiking rate of competing neural pools gives rise to a choice. Figure 3C is remarkably similar to the LIP activity observed experimentally (see Figure 7A in [13] and Figure 5A in [35]). Note that only one neuron was recorded at a time in the experiment. Nevertheless, the simulation results can be compared with the physiological data, if the activity of the winning (respectively losing) pool is mapped onto that of an LIP neuron on trials when the monkey's choice is toward (respectively away from) its preferred direction. Therefore, the model reproduces the salient characteristics of LIP activity correlated with perceptual decision making.

Psychometric Function and Decision Time

The model network's performance is measured as follows. For each c′ value, simulations are run thousands of times, and the choice on each trial is read out according to which of the two neural pools wins the competition or based on the population vector θPV. Figure 4A shows 20 traces of θPV with the stimulus at zero coherence. Clearly, when either population firing rate first reaches a threshold (57 Hz), θPV is exactly or almost equal to θ1 or θ2. As we shall see later, direction judgment in the identification task is also based on the population-vector analysis. Thus, the network uses the same readout scheme in both tasks.
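The population-vector readout mentioned above can be illustrated with a minimal sketch. It assumes the standard definition (a firing-rate-weighted sum of unit vectors at each neuron's preferred direction); the function name `population_vector`, the 1° grid of preferred directions, and the bell-shaped test profile are hypothetical choices for illustration, not the model's actual parameters.

```python
import numpy as np

def population_vector(rates, preferred_deg):
    """Return the population-vector direction in degrees, in [0, 360)."""
    theta = np.deg2rad(np.asarray(preferred_deg, dtype=float))
    rates = np.asarray(rates, dtype=float)
    # Sum unit vectors at each preferred direction, weighted by firing rate.
    x = np.sum(rates * np.cos(theta))
    y = np.sum(rates * np.sin(theta))
    return float(np.rad2deg(np.arctan2(y, x)) % 360.0)

# Hypothetical bell-shaped activity bump centred at 90 degrees (assumed shape).
preferred = np.arange(0.0, 360.0, 1.0)
d = np.deg2rad(preferred - 90.0)
bump = 40.0 * np.exp((np.cos(d) - 1.0) / 0.1)

theta_pv = population_vector(bump, preferred)
```

For a single symmetric bump, the readout coincides with the bump centre, which is why θPV sits at θ1 or θ2 once one pool has won the competition.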

Figure 4. The network's performance and population activity during the direction discrimination task.

(A) Time course of population vector. Twenty traces are shown with the stimulus at zero coherence. (B) The probability of correct choices versus motion strength. Data (circle) are fitted by a Weibull function with α = 6.85% and β = 1.45 (solid curve). (C) Time course of population firing rates r1 (black) and r2 (blue), averaged over correct (solid curves) and error (dashed curves) trials, respectively, for three coherence levels.

https://doi.org/10.1371/journal.pcbi.1000253.g004

The probability of a correct choice on any trial is determined by the percentage of trials on which the winning pool matches the one with a greater stimulus input. Figure 4B shows the psychometric function describing the probability of correct choices versus motion strength. The performance varies from chance to perfect discrimination when c′ is increased from 0% to 51.2%. The data are fitted by a Weibull function [47]:

p = 1 − 0.5 exp[−(c′/α)^β],

where α is the coherence level at which the performance is 82% correct and β describes the slope of the psychometric function. Our data are fitted by α = 6.85% and β = 1.45, consistent with the experimental values of 6.82% and 1.45 [13].
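As a quick numerical check, the sketch below evaluates the Weibull form p = 1 − 0.5 exp[−(c′/α)^β], which is the form implied by the definition of α as the 82%-correct coherence level. The function name `weibull_correct` is hypothetical, and the coherence values are simply those mentioned in the text.

```python
import math

def weibull_correct(c, alpha, beta):
    """Probability correct at coherence c (same units as alpha, here percent)."""
    return 1.0 - 0.5 * math.exp(-((c / alpha) ** beta))

ALPHA, BETA = 6.85, 1.45  # fitted values quoted in the text

# Coherence levels (percent) that appear in the text.
coherences = [0.0, 3.2, 12.8, 51.2]
curve = {c: weibull_correct(c, ALPHA, BETA) for c in coherences}

# At c = alpha the exponent is -1, so p = 1 - 0.5/e, i.e. about 82% correct.
p_at_alpha = weibull_correct(ALPHA, ALPHA, BETA)
```

The curve starts at chance (p = 0.5) for zero coherence and saturates near 1 at 51.2%, in line with the range described above.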

Figure 4C depicts the time course of population firing rates r1 and r2 averaged over correct and error trials, respectively. Given the coherent motion direction of θ1, the stimulus input to the pool selective for θ1 is larger than that to the other pool selective for θ2. On both correct and error trials, after the visual signal reaches the decision circuit, one pool ramps up its activity and thus ultimately wins the competition, whereas the other ramps down its activity. The population activity for the winner is lower on error trials than on correct trials, while that for the loser is less depressed on error trials. Furthermore, the ramping activity is more gradual on error trials. These differences become increasingly significant at higher coherence levels. This is because the winning neural pool receives less input on error trials than on correct trials, whereas the losing neural pool receives greater input on error trials than on correct trials. These trends have been observed experimentally in LIP activity (cf. Figure 11 in [13]).

In the reaction-time version of the direction discrimination task, the decision time is measured as the time it takes for either of the two population firing rates to first reach a prescribed firing threshold (see Methods). This is in line with the observation that when a saccadic response is triggered, the up-ramping activity of LIP neurons reaches a stereotypical level that is independent of coherence level [13],[35]. The generation of saccadic motor responses is not explicitly modeled here. At each coherence level, the sum of the mean decision time and a fixed non-decision time (about 70 ms) is comparable with the experimentally measured reaction time (Figure 5A). In addition, the mean decision time decreases nearly linearly with c′ on a logarithmic scale, in agreement with the behavioral data [13]. Consistent with the population activity shown in Figure 4C, the mean decision time is longer on error trials than on correct trials. Note that the shape of the histogram for decision time depends remarkably on coherence level (Figure 5B and 5C). At high coherence levels, decision times are narrowly distributed around a short time (Figure 5B). At lower coherence levels, the up-ramping neural activity is slower (Figure 3C), resulting in longer decision times and broader distributions (Figure 5C). Decision times are more variable on error trials (right panels) than on correct trials (left panels). Thus, our model reproduces salient features of reaction times observed experimentally [30].
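The threshold-crossing rule for decision time can be sketched as follows. The 57 Hz threshold and the roughly 70 ms non-decision time are taken from the text; the function name `decision_from_rates`, the 10 ms sampling step, and the linear ramps are hypothetical stand-ins for simulated population rates, with index 0 assumed to be stimulus onset.

```python
import numpy as np

def decision_from_rates(r1, r2, dt_ms, threshold_hz=57.0):
    """Return (choice, decision_time_ms), or (None, None) if no pool crosses."""
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    crossed = (r1 >= threshold_hz) | (r2 >= threshold_hz)
    if not crossed.any():
        return None, None
    k = int(np.argmax(crossed))          # index of the first crossing
    choice = 1 if r1[k] >= r2[k] else 2  # pool that reached threshold first
    return choice, k * dt_ms

# Hypothetical ramping traces sampled every 10 ms (illustrative only).
k = np.arange(200)
r1 = 20.0 + 0.5 * k    # winning pool ramps up
r2 = 20.0 - 0.05 * k   # losing pool ramps down

choice, decision_time = decision_from_rates(r1, r2, dt_ms=10.0)
reaction_time = decision_time + 70.0  # add the fixed non-decision time
```

A slower ramp (as at low coherence) pushes the first crossing later, which is the sense in which the decision-time distributions shift and broaden.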

Figure 5. Decision time in the direction discrimination task.

(A) Mean decision time as a function of motion strength. The mean decision time on error trials (square) is longer than that on correct trials (circle). The solid line is a linear fit to the data (circle). Error bars indicate SD. (B) The decision time histogram for c′ = 51.2% with the binwidth of 50 ms. (C) The histograms of decision time (with the binwidth of 100 ms) on correct (left) and error (right) trials for c′ = 3.2% (top) and 12.8% (bottom), respectively. Decision times are more variable at lower coherence levels. The numbers of trials used for plotting the histograms are indicated in the panels.

https://doi.org/10.1371/journal.pcbi.1000253.g005

Veridical Identification of Motion Direction

We have shown that a continuous recurrent network model reproduces salient experimental observations in the direction discrimination task [13],[35]. Now we turn to explore whether this circuit model also subserves analog computations underlying veridical judgments about motion direction. The simulations used the task protocol as in [4]. A random-dot motion stimulus was presented for a fixed duration of 1 s, followed by a saccadic eye movement indicating the monkey's judgment. On some trials, electrical stimulation was simultaneously applied to MT neurons for 1 s, and its impact on the monkey's direction estimates was measured. In this task, the monkey had the complete freedom to report veridically its perceived direction of motion in the visual stimulus. This judgment can be drastically different from the stimulus direction (θ1) since microstimulation may bias it toward the preferred direction (θ2) of MT cells at the microstimulation site. The generation of saccadic eye movements is not explicitly modeled.

Neural Integration of the Visual Stimulus and the Electrically Elicited Directional Signal

Figure 6A depicts typical network activity in response to only a motion stimulus with c′ = 80% and θ1 = 200°. Before stimulus presentation, pyramidal cells exhibit low spontaneous activity, which is homogeneous across the population. After stimulus onset, a bell-shaped activity pattern develops around θ1 since the cells with preferred directions around θ1 are most activated. The network dynamics are reflected in the time course of the population vector θPV, which converges to θ1 after initial transients (magenta trace). That is, the stimulus direction can be read out based on the population vector. If only microstimulation is applied to MT cells around θ2 (90°), a bump pattern develops and is centered at θ2 (Figure 6B). At the stimulus offset, active neurons show high firing rates comparable to those in Figure 6A, but the network activity profile is narrower. This results from the assumption that microstimulation activates a smaller number of MT neurons while MT neurons are widely tuned to visual stimuli. These results indicate that the network can represent directional signals by a bump state and that the population vector is a good measure for the network's direction judgments.

Figure 6. Neural activity related to direction identification in a veridical judgment task.

(A) Neural response to the motion stimulus alone. (Left) Spatiotemporal firing pattern of pyramidal cells superimposed by the time course of the population vector (magenta). The arrow indicates the coherent motion direction (200°) of the stimulus. The motion stimulus is presented at 500 ms and lasts 1 s. (Right) Network activity profile at stimulus offset. The firing rate is calculated by counting the number of spikes fired by each neuron within 50 ms preceding the stimulus offset, divided by 50 ms. (B) Neural response to the microstimulation of MT neurons alone. The black arrow marks the microstimulated direction (90°). Same conventions as in (A). (C) Neural response to the simultaneous presentation of the motion stimulus and microstimulation. (Top three panels) Neural activity on three sample trials. (Bottom panels) Time course of population firing rates of two neural pools separately centered at 90° (red) and 200° (black), corresponding to the above three individual trials (from left to right).

https://doi.org/10.1371/journal.pcbi.1000253.g006

When both the visual stimulus and microstimulation are applied simultaneously, the input profile is bimodal with two peaks around θ1 (200°) and θ2 (90°) (cf. Figure 2C). Figure 6C displays the network activity on three trials. Owing to noisy input and stochastic neural dynamics, the network activity varies from trial to trial. On the first trial, one bump develops, and θE, the value of θPV at stimulus offset, approximately equals θ2; that is, the direction estimate corresponds to the microstimulated direction. On the second trial, a single bump develops with θE ≈ θ1, and hence the estimate corresponds to the stimulus direction. On the third trial, the network activity profile remains bimodal, and the value of θE is a weighted sum of two coexisting bumps. In this particular example, θE equals 174°, closer to the stimulus direction than to the microstimulated direction.

The model network integrates external inputs in the form of slow ramping activity, as if the motion stimulus and microstimulation provide conflicting evidence for direction judgments. This can be seen in the time course of population firing rates, r1 and r2, of the two neural pools separately centered at θ1 and θ2 (Figure 6C, bottom). On the first and second trials, r1 and r2 first ramp up together and then begin to diverge at a time that varies considerably from trial to trial. After the diverging point, one further ramps up, while the other ramps down. On the third trial, r1 and r2 remain comparable with r1 slightly larger than r2, consistent with the fact that the direction estimate is closer to the stimulus direction. Therefore, even when the motion strength is as high as 80%, the network behavior can be drastically distinct on different trials. This implies that the integration process is essentially stochastic. Moreover, here direction estimates are based on the profile of network activity, i.e., population averaging. If we instead used a scheme in which direction estimate is assigned by the preferred direction of the most active neuron, it would always be around either θ1 or θ2, inconsistent with the behavioral data [4].
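The contrast between the two readout schemes can be made concrete with a toy bimodal profile. This is only a sketch: the bump shape, the peak rates (30 Hz and 25 Hz), and the helper `bump` are hypothetical, chosen so that two bumps coexist at 200° and 90° with the former slightly stronger.

```python
import numpy as np

preferred = np.arange(0.0, 360.0, 1.0)  # assumed 1-degree grid of preferred directions

def bump(center_deg, peak_hz, width=0.1):
    """Hypothetical bell-shaped activity profile on the ring."""
    d = np.deg2rad(preferred - center_deg)
    return peak_hz * np.exp((np.cos(d) - 1.0) / width)

# Two coexisting bumps, as on a trial where neither pool suppresses the other.
rates = bump(200.0, 30.0) + bump(90.0, 25.0)

# Readout 1: preferred direction of the most active neuron.
peak_readout = float(preferred[np.argmax(rates)])

# Readout 2: population vector (rate-weighted sum of unit vectors).
z = np.sum(rates * np.exp(1j * np.deg2rad(preferred)))
pv_readout = float(np.rad2deg(np.angle(z)) % 360.0)
```

With this profile the most-active-neuron readout lands on one of the two peaks, whereas the population vector falls between them and closer to the stronger bump, which is the qualitative behavior needed to produce intermediate estimates.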

Effect of Microstimulation on Direction Judgments

As mentioned above, microstimulation can bias the direction identification. Here, we systematically change the stimulus direction (θ1) to explore the effect of microstimulation (with fixed θ2) on direction judgments. With the protocol as in [4], a motion stimulus is presented at 80% coherence with its coherent motion direction in one of eight directions spanning 360° in 45° increment. In the absence of microstimulation, the profile of network activity is peaked at θ1, and thus θE is around θ1. Figure 7A displays the distributions of θE values on a circle for eight different stimuli. In each case, the data points cluster densely with little variability. The mean value of θE accurately matches the stimulus direction, and the standard deviations are negligible (Figure 7B). Therefore, the network judges the stimulus direction very accurately.

Figure 7. Effect of microstimulation on direction judgments.

(A–B) Direction estimates (θE) in the presence of motion stimulus alone. (A) The distribution of direction estimates on a ring for eight stimulus directions spanning 360° at 45° intervals. (B) The mean direction estimate versus the stimulus direction. The unity slope diagonal represents perfect identification performance on the task. Error bars indicate SD. (C,D) Direction estimates in the presence of both the motion stimulus and microstimulation. (C) The distribution of direction estimates on a ring for eight motion stimuli. Points are staggered radially for visualization purposes. (D) The shift of the mean direction estimate away from the stimulus direction (represented by open circle) due to the microstimulation of MT. The lines and arrows show the amplitude and direction of the shift in the mean direction estimate caused by microstimulation. The black arrow in the center denotes the overall effect of microstimulation on direction estimates, which is also the microstimulated direction.

https://doi.org/10.1371/journal.pcbi.1000253.g007

When microstimulation is applied simultaneously with θ2 = 90°, the resulting distribution of θE values depends on the angular difference between the two stimuli, Δθ = |θ2 − θ1| (Figure 7C). Qualitatively, three types of effects can be distinguished. First, for a small Δθ (e.g., 45° with θ1 = 45° or 135°), direction estimates from individual trials spread out between θ1 and θ2. Second, for an intermediate Δθ (e.g., 135° with θ1 = 225° or θ1 = 315°), the distribution of θE values is discontinuous; most estimates cluster around either θ1 or θ2, but other estimates scatter between the two directions. Third, for a large Δθ (e.g., 180° with θ1 = 270°), the distribution of θE values is bimodal, narrowly centered at θ1 and θ2.

Figure 7D depicts the shift of the mean value of θE away from the stimulus direction because of microstimulation, which can bias direction estimates toward the microstimulated direction. This effect occurs over nearly the whole range of stimulus directions (except for Δθ = 0° or 180°). To show the overall effect of microstimulation, we calculated both the center-of-mass of all single-trial direction estimates in the absence of microstimulation and that in the presence of microstimulation. The black arrow in the center of Figure 7D denotes the direction of the vector from the nonstimulated to the stimulated center-of-mass, which is just the microstimulated direction.

Mixed Strategy of Winner-Take-All and Vector Averaging

To understand the above three types of probabilistic direction identification, we investigated the network dynamics as Δθ was systematically varied. When Δθ is small, the input profile is unimodal, or there are two peaks but one is much shorter than the other (cf. Figure 2C, black trace with Δθ = 45°). Consequently, the network response is relatively simple, as illustrated in Figure 8A for Δθ = 70°. The stimuli activate a large number of pyramidal cells with preferred directions between θ1 and θ2, resulting in a unimodal activity profile peaked at ∼125°, which is the average of θ1 = 160° and θ2 = 90°. Therefore, direction judgments are based on vector averaging.

Figure 8. Distinct behavioral regimes during the probabilistic estimation of motion direction.

Network activity can be distinguished based on the difference between the stimulus and microstimulated directions, Δθ. Spatiotemporal firing pattern is superimposed by the time course of the population vector θPV (magenta). The network activity profile at the stimulus offset is shown on the right. The microstimulated direction is always 90°, while the stimulus direction θ1 varies with trials. (A) When Δθ is relatively small (θ1 = 160°), direction estimates are based on vector averaging. (B) For an intermediate Δθ (θ1 = 220°), the network exhibits winner-take-all on some trials (top and middle) and vector averaging on other trials (bottom). (C) For a large Δθ (θ1 = 270°), network activity is dominated by the winner-take-all mechanism. (D) The percentage of trials on which the smaller of |θE − θ1| and |θE − θ2| (with θE being the direction estimate) is larger than 10° as a function of Δθ.

https://doi.org/10.1371/journal.pcbi.1000253.g008

On the other hand, for Δθ = 180°, the input profile consists of two independent peaks, and two disjoint neural pools are activated. Thus, the network initially exhibits a bimodal activity profile, but the two bumps compete against each other over time (Figure 8C). At stimulus offset, one of the two bumps wins, and θE is close to either θ1 or θ2 (on the first and second trials). On very few trials (15 among 1800 trials), two bumps are visible (on the third trial); nevertheless, θE is still close to either θ1 or θ2. In this sense, direction judgment is determined by winner-take-all for a large Δθ.

For a broad range of intermediate Δθ between 70° and 170°, the input profile has two peaks at θ1 and θ2, but their width and height are not identical. The interaction of a visual stimulus and an artificially elicited directional signal is different from the visual-visual interactions [42]. Figure 8B shows the network activity for Δθ = 130°, similar to the case with Δθ = 110° (Figure 6C). The network behavior evolves based on the winner-take-all competition on some trials, where θE is close to either θ2 (on the first trial) or θ1 (on the second trial). On the other trials, however, two bumps develop initially and are sustained across the trial, in which cases the direction estimate is determined by vector averaging (θE = 120° on the third trial). In other words, direction estimates stochastically switch between the values determined separately by the winner-take-all and vector-averaging mechanisms across trials.

We found that the percentage P of trials on which the direction identification results from vector averaging decreases with increasing Δθ (Figure 8D). P is larger than 80% for Δθ = 80°, but it quickly becomes smaller than 10% for Δθ>100° and smaller than 5% for Δθ>150°. Therefore, for a sufficiently large distance between the two directional signals, the winner-take-all mechanism predominates. This can be explained as follows. The two neural subpopulations selectively responsive to the two input signals are sufficiently separated, so that they neither overlap nor excite each other significantly through localized lateral excitatory connections. Their interaction is mostly through shared feedback inhibition that underlies the winner-take-all competition. Owing to trial-to-trial neuronal fluctuations, however, the net inhibitory interactions may be insufficient to suppress the activity of either subpopulation on some trials, in which cases the direction estimation is determined by vector averaging.
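The criterion behind P (Figure 8D) counts a trial as vector averaging when the estimate lies more than 10° from both θ1 and θ2. A minimal sketch of that computation follows; the helper names and the five sample estimates are made up for illustration and are not simulation output.

```python
import numpy as np

def circ_dist_deg(a, b):
    """Smallest absolute angular difference between directions a and b (degrees)."""
    d = np.abs(np.asarray(a, dtype=float) - b) % 360.0
    return np.minimum(d, 360.0 - d)

def fraction_vector_averaging(estimates_deg, theta1, theta2, tol_deg=10.0):
    """Fraction of trials whose estimate is farther than tol_deg from both inputs."""
    est = np.asarray(estimates_deg, dtype=float)
    nearest = np.minimum(circ_dist_deg(est, theta1), circ_dist_deg(est, theta2))
    return float(np.mean(nearest > tol_deg))

# Made-up direction estimates for theta1 = 200 deg and theta2 = 90 deg.
estimates = [92.0, 198.0, 174.0, 88.0, 203.0]
P = fraction_vector_averaging(estimates, theta1=200.0, theta2=90.0)
```

Only the 174° estimate is more than 10° from both directions here, so one trial in five is classified as vector averaging.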

We further quantified the network's decision behavior by plotting the histograms of direction estimates (Figure 9A). For a small Δθ such as 70°, all estimates lie between θ1 and θ2, and the histogram is approximately Gaussian-distributed. For an intermediate Δθ such as 110°, most estimates are close to either θ1 or θ2, but there is also a substantial fraction of estimates in between. Accordingly, the histogram is bimodal. For a large Δθ such as 180°, all estimates lie close to either θ1 or θ2, so that the histogram consists of two narrow and isolated peaks. These results confirm the above conclusion that the network's direction judgments are based on vector averaging when Δθ is small, winner-take-all when Δθ is large, and a mixture of both for intermediate Δθ values.

Figure 9. Winner-take-all versus vector averaging in direction identification.

(A) The distribution of direction estimates on a circle (left) and the corresponding histogram with the binwidth of 5° (right). In each distribution, a wedge is defined by two directions (shown with open squares), separately denoting the median direction estimate for trials where the 80% coherence stimulus is applied alone and for trials where microstimulation is applied together with the 0% coherence stimulus. Three examples are displayed for Δθ = 70°,110°, and 180°, respectively (from top to bottom). Six hundred simulations were performed for each case. (B) The index R as a function of the angular difference between the stimulus and microstimulated directions, Δθ. Pure winner-take-all and vector averaging correspond to R = 1 and 2, respectively. The model displays a mixed strategy (with R between 1 and 2) for direction judgment over a wide range of Δθ values. It also predicts that for a given intermediate Δθ, a longer stimulus viewing time, for instance from 1 s (circle) to 2 s (cross), enhances the preponderance of the winner-take-all regime.

https://doi.org/10.1371/journal.pcbi.1000253.g009

Nichols and Newsome tested the winner-take-all versus vector-averaging coding schemes in the monkey experiment, using a measure called R that is defined as follows [4]. First, the median direction estimate is calculated separately for trials where the motion stimulus with c′ = 80% is presented alone (without microstimulation) and for trials where microstimulation is applied together with the 0% coherence stimulus. These two medians form a wedge (shown for our model in the left half of Figure 9A). R is then defined as the proportion of actual direction estimates (on the trials with both the 80% coherence stimulus and microstimulation) that lie within the wedge, divided by 0.5. As a result, R can be used to quantify the aforementioned three behavioral types. For instance, vector averaging implies that direction estimates lie completely within the wedge, so that R≃1/0.5 = 2. On the other hand, for pure winner-take-all, direction estimates are centered around the two medians, so that roughly half of them fall within the wedge and R≃0.5/0.5 = 1. R as a function of Δθ is plotted in Figure 9B (open circle). R is close to 2 for small Δθ, whereas it approaches unity when Δθ is close to 180°, similar to the experimental observation (Figure 6 in [4]). Moreover, there is a plateau at R≃1.35 for a range of intermediate Δθ values, a feature also present in the monkey data, which indicates a mixture of the winner-take-all and vector-averaging mechanisms. Note that the R curve is quite similar to the P curve shown in Figure 8D. Therefore, two entirely different measures both confirm the mixed strategy for direction identification over a wide range of intermediate Δθ values.
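The definition of R can be sketched in a few lines. This assumes the wedge is the shorter arc between the two medians, which is an interpretation rather than something stated explicitly here (and it is ambiguous when the medians are exactly 180° apart); the function names and the sample estimates are hypothetical.

```python
import numpy as np

def signed_diff_deg(a, b):
    """Signed angular difference a - b, wrapped to [-180, 180)."""
    return (np.asarray(a, dtype=float) - b + 180.0) % 360.0 - 180.0

def r_index(estimates_deg, median_visual, median_microstim):
    """Proportion of estimates inside the wedge between the medians, divided by 0.5."""
    span = signed_diff_deg(median_microstim, median_visual)  # wedge extent from one median
    offset = signed_diff_deg(estimates_deg, median_visual)   # each estimate relative to it
    if span >= 0:
        inside = (offset >= 0) & (offset <= span)
    else:
        inside = (offset <= 0) & (offset >= span)
    return float(np.mean(inside)) / 0.5

# Made-up estimates with medians at 160 deg (visual) and 90 deg (microstimulation).
r_averaging = r_index([120.0, 125.0, 130.0, 118.0], 160.0, 90.0)  # all inside the wedge
r_wta = r_index([95.0, 85.0, 165.0, 155.0], 160.0, 90.0)          # scattered around the medians
```

In the first case every estimate lies inside the wedge and R = 2; in the second, estimates straddle the two medians so that half fall inside and R = 1, matching the two limiting cases described above.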

We reasoned that when the sensory stimulus and microstimulation provide conflicting signals, time integration may be important to resolve the ambiguity. In neuronal terms, a longer stimulus viewing time should allow one of two bumps in the network activity pattern to evolve to become dominant at the expense of the other. We tested this prediction by computing R under the condition where the motion stimulus lasted 2 s instead of 1 s. Indeed, with a longer stimulus viewing time, R generally becomes lower and is smaller than 1.15 when Δθ≥110° (Figure 9B, cross). This model prediction is testable in future experiments.

Discussion

Growing evidence indicates that in a random-dot motion discrimination task, while MT neurons encode motion directions, perceptual decisions are made downstream, perhaps in the parietal cortex [10],[12],[13],[35],[48],[49] or the prefrontal cortex [17]. Similarly, in a detection task (that requires a ‘yes or no’ binary response) using near-threshold somatosensory stimuli, neural activity in the prefrontal cortex, but not in the primary somatosensory cortex, was found to covary trial-by-trial with the subjective report [50]. What are the microcircuit properties that allow a ‘decision circuit’ to subserve perceptual judgments? We have previously proposed a cortical circuit model endowed with slow reverberatory excitation and feedback inhibition, which allows for the temporal integration of sensory stimuli and the formation of categorical choice [19],[21],[23]. This type of model framework has also been applied to somatosensory discrimination [18],[20],[24] and detection [25]. In the present study, we extended this approach to a continuous recurrent network. Our results suggest that a common cortical circuit can perform both the categorical discrimination and veridical judgment tasks.

Temporal Integration and Categorical Choice in the Discrimination Task

In a two-alternative direction discrimination task, a subject must be instructed by visual targets as to what the discrete choice options are [13]. In the continuous recurrent network model, we implemented the two targets (at θ1 and θ2) and examined how the network integrates a motion stimulus biased by the targets and makes a categorical choice (θ1 or θ2). In consonance with the previous models with discrete neural pools [19],[21], our model reproduces salient observations of LIP activity in the monkey experiment [13]. First, the population firing rates of two competing neural pools first increase together and then diverge, with one continuing to build up while the other decays. Second, cells exhibit ramp-like activity, which is slower at lower motion strength. Third, the activity of the winning pool is higher on correct trials than on error trials, whereas the opposite is true for that of the losing pool. Furthermore, at the behavioral level, our model reproduces the psychometric and chronometric functions as well as the observation that the mean reaction time is longer on error trials than on correct trials [13].

In our model, slow temporal integration is instantiated by reverberatory excitation mediated by NMDARs [19],[21]. This is mainly related to its slow synaptic kinetics. We further tested this mechanism by partially replacing NMDARs with much faster AMPARs at recurrent excitatory synapses. As a result, the network's ability to integrate input signals is significantly reduced and the network's performance also deteriorates (data not shown). Experimentally, it would be interesting to measure whether direction discrimination becomes more impulsive and less accurate when NMDAR antagonists are applied to LIP in behaving monkeys. On the other hand, other slow positive feedback processes, such as short-term synaptic facilitation and those involving specific ion channels, could also contribute to time integration, which remains to be investigated experimentally and theoretically. In sum, we suggest that strong reverberation in a cortical microcircuit should be slow in order to subserve cognitive-type computations.

Recurrent excitation must be balanced by feedback inhibition [34],[51]. Lateral inhibition between neural pools involved in decision computation is consistent with the observation that the microstimulation of one neural pool in LIP not only speeds up the choices in its preferred direction but also slows down the choices in its null direction [49]. Ditterich found that an accumulator model produces reaction time distributions with long right tails, inconsistent with the behavioral data, and that the inclusion of lateral inhibition worsens the problem, resulting in even longer right tails especially at low coherence levels [30]. This is not the case in our model; the decision time distributions, although not Gaussian-distributed, do not show pronounced right tails, similar to those observed experimentally [30]. A distinguishing feature of our nonlinear network model is strong recurrent excitation, which is absent in linear accumulator models. The positive feedback mechanism ultimately leads to an acceleration of ramping neural activity toward a decision bound, preventing excessively long decision times. Indeed, Ditterich showed that the monkey's reaction time distributions can be well fitted by the accumulator model with an additional assumption that the decision bound decreases over time. This is functionally equivalent to a temporally increasing ramping slope, which naturally occurs in our recurrent circuit model.

Mixed Strategy for Probabilistic Estimation of an Analog Stimulus Feature

We also applied the continuous recurrent network model to a direction identification task [4], assuming that the network represents a cortical area like LIP, downstream from MT. In the absence of physiological data, we assumed for the sake of simplicity that the inputs separately representing the motion stimulus and the electrically evoked directional signal sum linearly before being fed into the decision circuit. We also took into account lateral inhibition in MT [52],[53], assuming that the input profile for microstimulation has a Mexican-hat shape, which represents a nonlinear effect.

Since MT neurons are broadly tuned to visual motion signals, an important issue is how to link MT activity profile to subjects' percept. A number of studies have explored decoding strategies that the brain might use when there are two coexisting competing signals, each activating a different pool of MT neurons [42],[54],[55]. Nichols and Newsome inferred from the monkey's behavioral performance that different decoding schemes might be used when the angular distance between the direction signals is smaller or larger than 140° [4]. MT neurons with nearly opposite direction preferences appeared to compete to determine the monkey's percept, as predicted by winner-take-all; whereas MT neurons with preferred directions as different as 140° could cooperate to influence the monkey's percept, consistent with vector averaging or other distributed coding.

In our decision circuit, which is downstream from MT, direction judgments are based on the activity of all neurons. That is, we always use the population vector for direction estimation, and such estimates are in good agreement with the behavioral data. Nevertheless, when the stimulus and microstimulated directions are separated by a sufficiently large distance, direction judgments naturally instantiate winner-take-all, whereas when they are close to each other, direction judgments are consistent with vector averaging.

Interestingly, for the two directions with an intermediate angular distance, the network displays a “mixed strategy”, i.e., perceptual estimates are produced by winner-take-all on some trials and by vector averaging on the other trials. A prediction is that within this mixed-strategy regime, quick responses are based on vector averaging, whereas a longer integration of conflicting signals is more likely to yield a winner-take-all based categorical choice. Such temporal tradeoff should be observable at the level of neural activity. These specific model predictions can be tested in future experiments.

Readout of Direction Judgments by Neurons Downstream from LIP

In the present work, we used a simple method (i.e., the population-vector analysis) to read out a direction estimate on each trial. In the future, it would be worthwhile to explicitly examine the neural circuit mechanism underlying the readout process. While cortical areas like LIP may be critically involved in accumulating information and making choices, the actual saccadic response that signals the monkey's decision is produced downstream. For instance, neurons in the superior colliculus, a command center for saccadic eye movements, respond to both the targets and the random-dot motion stimulus in the direction discrimination task [56]. It has been proposed that burst firing of movement neurons in the superior colliculus may be triggered when the synaptic excitation from ramping cortical neurons exceeds a threshold, thereby providing a cellular basis for a decision bound [22]. It will be worth exploring whether the superior colliculus circuit provides additional mechanisms that contribute to readout of perceptual decisions.

In fact, we have already developed an extended model in which a second circuit (that mimics the superior colliculus) receives synaptic input from the decision circuit and can generate a burst of activity signaling a saccade. This is essentially a continuous network version of the cortico-superior colliculus model (with four discrete neural pools) [22]. In this double-ring model, it is natural to read out direction estimates without assuming the threshold crossing of neural firing rates. Preliminary data (not shown) suggest that this extension does not significantly alter the conclusions drawn in this paper.

Comparison with Other Models

It is worth noting that the continuous recurrent network model is adequate for the simulation of both perceptual decision tasks. In both tasks, the decision is about the coherent motion direction, which is a one-dimensional feature. In our network, each neuron has a preferred motion direction to which it is most sensitive. When the readout of direction judgments is based on the population vector, downstream neurons pool the activity of LIP neurons to produce a directional signal for the saccadic eye response. Previous spiking network models of perceptual discrimination [19],[21] have discrete (usually two) neural pools; by using a continuous network instead, our work extends this approach from a small set of alternatives to a continuum of possible motion directions. It would be rather straightforward to extend this one-dimensional model to a two-dimensional network model. For example, a two-dimensional firing-rate model for saccadic action selection (not perceptual decisions) has been proposed in [57]. However, computer simulations of such two-dimensional spiking neural circuits are computationally costly, especially for stochastic decision tasks where thousands of trials are required to gather necessary statistics under each condition (just as in the monkey experiments).

In this work, we have focused on the reaction-time version of the categorical discrimination task, in which a simulated trial is terminated when either of two population firing rates first reaches a threshold, and the corresponding choice and decision time are recorded. In the direction identification task, the response signaling a veridical judgment is produced at the offset of the visual stimulus presentation, as in the experiment of Nichols and Newsome [4]. Neither of the task paradigms involves working memory, and we did not specifically simulate the fixed-duration version of the discrimination task [12],[13].
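The termination rule for the reaction-time task can be sketched as follows. This is a minimal illustration, assuming two already-computed population firing-rate traces sampled at a fixed time step; the function name, the threshold value, the time step, and the toy linear ramps are hypothetical and are not the parameters used in the simulations.

```python
def first_threshold_crossing(rate_a, rate_b, threshold_hz, dt_ms):
    """Return (choice, decision_time_ms) at the first threshold crossing.

    rate_a, rate_b: population firing rates (Hz) sampled every dt_ms.
    Returns (None, None) if neither trace reaches the threshold.
    """
    for step, (ra, rb) in enumerate(zip(rate_a, rate_b)):
        a_hit = ra >= threshold_hz
        b_hit = rb >= threshold_hz
        if a_hit or b_hit:
            # If both cross on the same step, pick the larger rate (assumed tie rule).
            choice = "A" if a_hit and (not b_hit or ra >= rb) else "B"
            return choice, step * dt_ms
    return None, None

# Toy ramps standing in for simulated population rates (hypothetical values).
rate_a = [10.0 + 0.5 * k for k in range(200)]  # faster ramp
rate_b = [10.0 + 0.1 * k for k in range(200)]  # slower ramp

choice, decision_time = first_threshold_crossing(rate_a, rate_b,
                                                 threshold_hz=40.0, dt_ms=1.0)
print(choice, decision_time)
```

Repeating this over many stochastic trials at each coherence level would give the choice proportions and mean decision times from which psychometric and chronometric functions are built.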

While our model was based on that designed for spatial working memory [34], we changed some parameter values to reproduce comparable behavioral data from the monkey experiments (such as the psychometric and chronometric functions for the discrimination task and the R plot for the identification task). Interestingly, with this new set of parameter values, the network does not exhibit self-sustained persistent activity. This is at variance with our previous work using a model with discrete neural pools [19],[21]. In the future, it would be interesting to use the same model to simulate the fixed-duration version of the categorical discrimination task (where two targets exist throughout the trial) and analyze systematically to what extent the ability to carry out decision computation depends on the working memory capacity in the continuous network model (as we have previously done with the discrete model [21]).

A Cognitive-Type Cortical Circuit Capable of Performing Multiple Functions

A continuous recurrent network model, which was originally developed for mnemonic delay-period activity in spatial working memory [34], has been elaborated in several ways [58]–[61]. Direction-selective persistent neural activity has been observed in both the prefrontal [62] and the posterior parietal cortex [63]. We argue that a cognitive-type cortical circuit like the parietal or prefrontal cortex is equipped with strongly recurrent connectivity to subserve both internal representation of information and dynamic decision computations. On the other hand, it is still unclear to what extent a network's capacity for decision computations and its capacity for working memory necessarily depend on each other. Conceivably, top-down control signals could adaptively modulate a cortical circuit such as LIP, so that it can operate in different dynamical regimes to fulfill different computational demands. Regardless, the present work, by demonstrating that a single cortical circuit is able to perform the veridical judgment and categorical discrimination tasks, represents a significant step toward uncovering the circuit and neurodynamical underpinnings of cognition.

Acknowledgments

We thank Alexander C. Huk for comments on a previous version of this paper.

Author Contributions

Conceived and designed the experiments: FL XJW. Performed the experiments: FL. Analyzed the data: FL XJW. Wrote the paper: FL XJW.

References

  1. Luce RD (1986) Response times: Their role in inferring elementary mental organization. New York: Oxford University Press.
  2. Jazayeri M, Movshon JA (2006) Optimal representation of sensory information by neural populations. Nat Neurosci 9: 690–696.
  3. Cook EP, Maunsell JHR (2002) Dynamics of neuronal responses in macaque MT and VIP during motion detection. Nat Neurosci 5: 985–994.
  4. Nichols MJ, Newsome WT (2002) Middle temporal visual area microstimulation influences veridical judgments of motion direction. J Neurosci 22: 9530–9540.
  5. Britten KH, Shadlen MN, Newsome WT, Movshon JA (1992) The analysis of visual motion: A comparison of neuronal and psychophysical performance. J Neurosci 12: 4745–4765.
  6. Parker AJ, Newsome WT (1998) Sense and the single neuron: Probing the physiology of perception. Annu Rev Neurosci 21: 227–277.
  7. Romo R, Salinas E (2001) Touch and go: Decision-making mechanisms in somatosensation. Annu Rev Neurosci 24: 107–137.
  8. Schall JD (2001) Neural basis of deciding, choosing and acting. Nat Rev Neurosci 2: 33–42.
  9. Sugrue LP, Corrado GS, Newsome WT (2005) Choosing the greater of two goods: Neural currencies for valuation and decision making. Nat Rev Neurosci 6: 363–375.
  10. Gold JI, Shadlen MN (2007) The neural basis of decision making. Annu Rev Neurosci 30: 535–574.
  11. Wang X-J (2008) Decision making in recurrent neuronal circuits. Neuron 60: 215–234.
  12. Shadlen MN, Newsome WT (2001) Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. J Neurophysiol 86: 1916–1936.
  13. Roitman JD, Shadlen MN (2002) Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. J Neurosci 22: 9475–9489.
  14. Hernández A, Zainos A, Romo R (2002) Temporal evolution of a decision making process in medial premotor cortex. Neuron 33: 959–972.
  15. Romo R, Hernández A, Zainos A, Lemus L, Brody CD (2002) Neuronal correlates of decision-making in secondary somatosensory cortex. Nat Neurosci 5: 1217–1225.
  16. Romo R, Hernández A, Zainos A (2004) Neuronal correlates of a perceptual decision in ventral premotor cortex. Neuron 41: 165–173.
  17. Kim JN, Shadlen MN (1999) Neural correlates of a decision in the dorsolateral prefrontal cortex of the macaque. Nat Neurosci 2: 176–183.
  18. Machens CK, Romo R, Brody CD (2005) Flexible control of mutual inhibition: A neural model of two-interval discrimination. Science 307: 1121–1124.
  19. Wang X-J (2002) Probabilistic decision making by slow reverberation in cortical circuits. Neuron 36: 955–968.
  20. Miller P, Wang X-J (2006) Inhibitory control by an integral feedback signal in prefrontal cortex: A model of discrimination between sequential stimuli. Proc Natl Acad Sci U S A 103: 201–206.
  21. Wong KF, Wang X-J (2006) A recurrent network mechanism of time integration in perceptual decisions. J Neurosci 26: 1314–1328.
  22. Lo CC, Wang X-J (2006) Cortico-basal ganglia circuit mechanism for a decision threshold in reaction time tasks. Nat Neurosci 9: 956–963.
  23. Wong KF, Huk AC, Shadlen MN, Wang X-J (2007) Neural circuit dynamics underlying accumulation of time-varying evidence during perceptual decision-making. Front Comput Neurosci 1: 6.
  24. Deco G, Rolls ET (2006) Decision-making and Weber's law: A neurophysiological model. Eur J Neurosci 24: 901–916.
  25. Deco G, Pérez-Sanagustín M, de Lafuente V, Romo R (2007) Perceptual detection as a dynamical bistability phenomenon: A neurocomputational correlate of sensation. Proc Natl Acad Sci U S A 104: 20073–20077.
  26. Ratcliff R (1978) A theory of memory retrieval. Psychol Rev 85: 59–108.
  27. Usher M, McClelland J (2001) On the time course of perceptual choice: The leaky competing accumulator model. Psychol Rev 108: 550–592.
  28. Smith PL, Ratcliff R (2004) Psychology and neurobiology of simple decisions. Trends Neurosci 27: 161–168.
  29. Mazurek ME, Roitman JD, Ditterich J, Shadlen MN (2003) A role for neural integrators in perceptual decision making. Cereb Cortex 13: 1257–1269.
  30. Ditterich J (2006) Evidence for time-variant decision making. Eur J Neurosci 24: 3628–3641.
  31. Seung HS, Sompolinsky H (1993) Simple models for reading neuronal population codes. Proc Natl Acad Sci U S A 90: 10749–10753.
  32. Pouget A, Dayan P, Zemel RS (2003) Inference and computation with population codes. Annu Rev Neurosci 26: 381–410.
  33. Seriès P, Latham PE, Pouget A (2004) Tuning curve sharpening for orientation selectivity: Coding efficiency and the impact of correlations. Nat Neurosci 7: 1129–1135.
  34. Compte A, Brunel N, Goldman-Rakic PS, Wang X-J (2000) Synaptic mechanisms and network dynamics underlying spatial working memory in a cortical network model. Cereb Cortex 10: 910–923.
  35. Huk AC, Shadlen MN (2005) Neural activity in macaque parietal cortex reflects temporal integration of visual motion signals during perceptual decision making. J Neurosci 25: 10420–10436.
  36. Salzman CD, Britten KH, Newsome WT (1990) Cortical microstimulation influences perceptual judgments of motion direction. Nature 346: 174–177.
  37. Goldman-Rakic PS (1995) Cellular basis of working memory. Neuron 14: 477–485.
  38. Wang X-J (1999) Synaptic basis of cortical persistent activity: The importance of NMDA receptors to working memory. J Neurosci 19: 9587–9603.
  39. Britten KH, Newsome WT (1998) Tuning bandwidths for near-threshold stimuli in area MT. J Neurophysiol 80: 762–770.
  40. Albright TD (1984) Direction and orientation selectivity of neurons in visual area MT of the macaque. J Neurophysiol 52: 1106–1130.
  41. Britten KH, Shadlen MN, Newsome WT, Movshon JA (1993) Responses of neurons in macaque MT to stochastic motion signals. Vis Neurosci 10: 1157–1169.
  42. Treue S, Hol K, Rauber HJ (2000) Seeing multiple directions of motion-physiology and psychophysics. Nat Neurosci 3: 270–276.
  43. Georgopoulos AP, Schwartz AB, Kettner RE (1986) Neuronal population coding of movement direction. Science 233: 1416–1419.
  44. Hansel D, Mato G, Meunier C, Neltner L (1998) On numerical simulations of integrate-and-fire neural networks. Neural Comp 10: 467–483.
  45. Sato T, Schall JD (2001) Pre-excitatory pause in frontal eye field responses. Exp Brain Res 139: 53–58.
  46. Li XB, Kim B, Basso MA (2006) Transient pauses in delay-period activity of superior colliculus neurons. J Neurophysiol 95: 2252–2264.
  47. Quick RF (1974) A vector-magnitude model of contrast detection. Kybernetik 16: 65–67.
  48. Shadlen MN, Newsome WT (1996) Motion perception: Seeing and deciding. Proc Natl Acad Sci U S A 93: 628–633.
  49. Hanks TD, Ditterich J, Shadlen MN (2006) Microstimulation of macaque area LIP affects decision-making in a motion discrimination task. Nat Neurosci 9: 682–689.
  50. de Lafuente V, Romo R (2005) Neuronal correlates of subjective sensory experience. Nat Neurosci 8: 1698–1703.
  51. Brunel N, Wang X-J (2001) Effects of neuromodulation in a cortical network model of object working memory dominated by recurrent inhibition. J Comput Neurosci 11: 63–85.
  52. Heeger DJ, Simoncelli EP, Movshon JA (1996) Computational models of cortical visual processing. Proc Natl Acad Sci U S A 93: 623–627.
  53. Ardid S, Wang X-J, Compte A (2007) An integrated microcircuit model of attentional processing in the neocortex. J Neurosci 27: 8486–8495.
  54. Salzman CD, Newsome WT (1994) Neural mechanisms for forming a perceptual decision. Science 264: 231–237.
  55. Groh JM, Born RT, Newsome WT (1997) How is a sensory map read out? Effects of microstimulation in visual area MT on saccades and smooth pursuit eye movements. J Neurosci 17: 4312–4330.
  56. Horwitz GD, Newsome WT (2001) Target selection for saccadic eye movements: Prelude activity in the superior colliculus during a direction-discrimination task. J Neurophysiol 86: 2543–2558.
  57. Wilimzig C, Schneider S, Schöner G (2006) The time course of saccade decision making: Dynamic field theory. Neural Netw 19: 1059–1074.
  58. Tegnér J, Compte A, Wang X-J (2002) The dynamical stability of reverberatory neural circuits. Biol Cybern 87: 471–481.
  59. Renart A, Song P, Wang X-J (2003) Robust spatial working memory through homeostatic synaptic scaling in heterogeneous cortical networks. Neuron 38: 473–485.
  60. Wang X-J, Tegnér J, Constantinidis C, Goldman-Rakic PS (2004) Division of labor among distinct subtypes of inhibitory neurons in a cortical microcircuit of working memory. Proc Natl Acad Sci U S A 101: 1368–1373.
  61. Carter E, Wang X-J (2007) Cannabinoid-mediated disinhibition and working memory: Dynamical interplay of multiple feedback mechanisms in a continuous attractor model of prefrontal cortex. Cereb Cortex 17 (Suppl 1): i16–i26.
  62. Funahashi S, Bruce CJ, Goldman-Rakic PS (1989) Mnemonic coding of visual space in the monkey's dorsolateral prefrontal cortex. J Neurophysiol 61: 331–349.
  63. Gnadt JW, Andersen RA (1988) Memory related motor planning activity in posterior parietal cortex of macaque. Exp Brain Res 70: 216–220.