
Conceived and designed the experiments: CE DM. Performed the experiments: CE DM. Analyzed the data: DM. Contributed reagents/materials/analysis tools: CE. Wrote the paper: CE DM.

The authors have declared that no competing interests exist.

A central criticism of standard theoretical approaches to constructing stable, recurrent model networks is that the synaptic connection weights need to be finely-tuned. This criticism is severe because proposed rules for learning these weights have been shown to have various limitations to their biological plausibility. Hence it is unlikely that such rules are used to continuously fine-tune the network

Persistent neural activity is typically characterized as a sustained increase in neural firing, sometimes lasting up to several seconds, and usually following a brief stimulus. It has been thought to underlie a wide variety of neural computations, including the integration of velocity commands

However, as demonstrated by

Consequently, it is an open problem as to how real neurobiological systems produce the observed stability. The most direct answer to this question – that there are learning mechanisms for fine-tuning – has also seemed implausible. Several models that have adopted such an approach require a retinal slip signal in order to tune the integrator

Here we propose a learning rule that is able to account for available plasticity results, while being biologically plausible. Specifically, we demonstrate that our proposed rule: 1) fine-tunes the connection weights to values able to reproduce experimentally observed behavior; 2) explains the mis-tuning of the neural integrator under various conditions; and 3) relies only on known inputs to the system. We also suggest a generalization of this rule that may be exploited by a wide variety of neural systems to induce stability in higher-dimensional spaces, like those possibly used in the head-direction and path integration systems in the rat

To understand the results and genesis of the proposed learning rule, it is useful to begin with a standard theoretical characterization of an attractor network. The “optimal” neural integrator model used in this study is constructed using the Neural Engineering Framework (NEF) methods described in

For simplicity, each neuron in the integrator is modeled as a spiking leaky integrate-and-fire (LIF) neuron, though little depends on this choice of neuron model

The interactions between neurons are captured by allowing spikes generated by neurons to elicit post-synaptic currents (PSCs) in the dendrites of neurons to which they project. The PSCs are modeled as exponentially decaying with a time constant of
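As a minimal sketch of the exponential PSC described above, the filtering of a spike train can be written as a first-order recursion. The values here are assumptions chosen only for illustration and are not taken from the model: the time constant `tau_psc = 0.1` s, the step `dt`, and the regular 50 Hz input train.

```python
import numpy as np

def filter_spikes(spikes, dt, tau_psc):
    """Filter a 0/1 spike train with the PSC h(t) = exp(-t / tau_psc) / tau_psc."""
    decay = np.exp(-dt / tau_psc)
    filtered = np.zeros(len(spikes))
    state = 0.0
    for k, s in enumerate(spikes):
        # Each spike adds 1/tau_psc; the state decays exponentially between spikes.
        state = state * decay + s / tau_psc
        filtered[k] = state
    return filtered

dt = 0.001          # simulation step in seconds (assumed)
tau_psc = 0.1       # PSC time constant in seconds (assumed, not from the source)
spikes = np.zeros(2000)
spikes[::20] = 1.0  # regular 50 Hz train over 2 s (illustrative input)

psc = filter_spikes(spikes, dt, tau_psc)
mean_late = psc[1000:].mean()  # average after the initial transient
```

With this normalization the filtered signal settles near the firing rate of the input train but continues to ripple with each spike.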

The total current flowing into the soma of a receiving cell from the dendrites,

a) The dynamics of a model neuron coupled to a PSC model provides the complete model of a single cell. Spikes arrive, are filtered by a weighted post-synaptic current and then drive a spiking nonlinearity. b) Tuning curves for 40 simulated goldfish neurons with a cellular membrane time constant,

To use this cellular model to perform integration it is essential to determine the appropriate recurrent connection weights

In both mammals and fish, the relevant network of cells receives projections from earlier parts of the brain that provide a velocity command to update eye position. In addition, many of the cells in the network are connected to one another, making it naturally modeled as a recurrent network. This network turns the velocity command into an eye position command, and projects the result to the motor neurons which directly affect the relevant muscles. Thus, our model circuit consists of one population of recurrently connected neurons, which receives a velocity input signal

To construct the model, we begin with an ensemble of 40 neurons (approximately the number found in the goldfish integrator), which have firing curves randomly distributed to reflect known tuning in the goldfish

To determine what aspects of that information are available to subsequent neurons from this activity (i.e., to determine what is represented), we need to find a decoder

Optimal linear decoders can be found by minimizing the difference between the represented eye position

For the neural integrator model, it is also essential to determine how to recurrently connect the population to result in stable dynamics.

A network with these recurrent weights will attempt to hold the present representation of eye position as long as there is no additional input. However, even given optimal weights there are many reasons that the eye position will drift. These include representational error introduced by the nonlinearities in the encoding, fluctuations in the representation of eye position due to the non-steady nature of filtered spike trains, and the many sources of noise attributed to neural systems

Note also that this network will mathematically integrate its input. If we inject additional current into the neural population, it acts as an extra change in the eye position, and will be added to the representation of eye position. Additional input will thus be summed over time (i.e., integrated) until it stops, at which point the system will attempt to hold the new representation of eye position. In short, an input proportional to eye velocity will be integrated to drive the circuit to a new eye position. The stable representation of eye position by this circuit for different velocity inputs is discussed in the

To complete our discussion of the optimal neural integrator, in this section we describe the methods used to compute optimal linear decoders
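The idea of finding linear decoders by minimizing the difference between the actual and decoded eye position can be sketched with regularized least squares over rate-based LIF tuning curves. This is an illustration under stated assumptions rather than the procedure used for the model: the LIF constants, the gain and threshold distributions, the normalized position grid, and the regularization term are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 40                       # ensemble size used in the text
tau_rc, tau_ref = 0.02, 0.002        # LIF membrane/refractory constants (assumed)

enc = rng.choice([-1.0, 1.0], size=n_neurons)      # encoders: preferred direction
gain = rng.uniform(0.5, 2.0, size=n_neurons)       # hypothetical gains
x_int = rng.uniform(-0.9, 0.9, size=n_neurons)     # hypothetical firing thresholds

def lif_rates(x):
    """Steady-state LIF firing rates (Hz) for normalized eye positions x."""
    x = np.atleast_1d(x)
    J = 1.0 + gain * (enc * x[:, None] - x_int)    # input current, threshold at J = 1
    rates = np.zeros_like(J)
    active = J > 1.0
    rates[active] = 1.0 / (tau_ref - tau_rc * np.log(1.0 - 1.0 / J[active]))
    return rates

x = np.linspace(-1.0, 1.0, 201)      # normalized eye positions (assumed grid)
A = lif_rates(x)                     # activity matrix, shape (201, 40)

# Regularized least squares: minimize |A d - x|^2 + reg |d|^2.
reg = len(x) * (0.1 * A.max()) ** 2  # regularization strength (assumed)
decoders = np.linalg.solve(A.T @ A + reg * np.eye(n_neurons), A.T @ x)

x_hat = A @ decoders                 # decoded eye position
rmse = np.sqrt(np.mean((x_hat - x) ** 2))
```

The residual `rmse` is the static representational error of the population; it is nonzero even for the best linear decoders, which is one reason the represented position drifts.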

Plasticity in the neural integrator is evident across a wide variety of species, and there is strong evidence that modification of retinal slip information is able to cause the oculomotor integrator to become unstable or damped

The goal of this study is to determine a biologically plausible learning rule that tunes the network to integrate as well as the linear optimal network described above. The learning rule derived here is based on the idea that integrators should be able to exploit the corrective input signals they receive. Empirical evidence indicates that all input at the integrator itself is in the form of velocity commands

In the oculomotor integrator, there is evidence of two classes of input:

a) Eye position for a series of saccades. b) The saccade velocity, based on a). c) Filtering based on magnitude. This method uses Equation 15 to filter the velocity profile. This is the method adopted for all subsequent experiments. d) Filtering based on a change in position, where a change in position greater than 5 degrees allows the subsequent velocity commands to pass through at a magnitude inversely proportional to the time elapsed after a movement.

Furthermore

Nevertheless, retinal slip plays an important role in the overall system. In most models of the oculomotor system, including the one we adopt below, corrective saccades are generated on the basis of retinal slip information. If the retinal image is moving, but there have been no self-generated movements (i.e., the retinal image is “slipping”), the system will generate corrective velocity commands to eliminate the slip. Consequently, the integrator itself has only indirect access to retinal slip information. Below, we show that this is sufficient to drive an appropriate learning rule.

Before turning to the rule itself, it is useful to first consider what is entailed by the claim that the system must be finely tuned. An integrator is able to maintain persistent activity when the sum of current from feedback connections is equal to the amount of current required to exactly represent the eye position in an open loop system. If the eye position representation determined by the feedback current and the actual eye position are plotted on normalized axes, the mapping for a perfect integrator would define a line of slope 1 through the origin (see

Eye position is normalized to lie on a range of

However, if the magnitude of the feedback is less than what is needed, the represented eye position will drift towards zero. This is indicated by the slope of the system transfer function being less than 1. Such systems are said to be dynamically damped. Conversely, if the feedback is greater than needed, the slope of the transfer function is greater than 1 and the system output will drift away from zero. Such systems are said to be dynamically unstable (see
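The effect of the feedback slope on drift can be illustrated with a simplified linear system. This sketch assumes first-order synaptic dynamics, tau * dx/dt = -x + slope * x, with a hypothetical PSC time constant of 0.1 s, and it ignores the neural nonlinearities and noise of the full model.

```python
import numpy as np

def hold(x0, slope, tau_psc=0.1, dt=0.001, t_end=5.0):
    """Simulate tau_psc * dx/dt = -x + slope * x (no input) and return the final x."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        # Forward-Euler step of the linearized feedback dynamics.
        x += dt * (slope * x - x) / tau_psc
    return x

x0 = 0.5                              # initial normalized eye position (illustrative)
x_damped = hold(x0, slope=0.98)       # drifts towards zero
x_exact = hold(x0, slope=1.00)        # holds its value
x_unstable = hold(x0, slope=1.02)     # drifts away from zero
```

Under these assumptions the drift time constant is tau_psc / |1 - slope|, so a 2% error in slope with a 0.1 s synapse already limits the time constant to about 5 s. This is the sense in which the feedback must be finely tuned.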

As described earlier, the representation of eye position given by equation 8 has a definite error (for the neurons depicted in

Given this background, it is possible to derive a learning rule that minimizes the difference between the neural representation of eye position

Importantly, it is now possible to substitute for the bracketed term using the negative of the corrective saccade. This substitution can be made because

Unfortunately, this rule is neither in terms of the connection weights of the circuit, nor local. These two concerns can be alleviated by multiplying both sides of the expression by the encoder and gain of neurons

Second, the right-hand side of Equation 17 is in a pseudo-Hebbian form: there is a learning rate

However, the current and the activity are highly correlated, as the

Finally, it should be noted that the integrator subject to this rule is driven by all velocity inputs as usual. Both corrective and intentional saccades determine the firing of the neurons in the integrator, and are integrated by the circuit. The mechanism that distinguishes these two kinds of saccades (

Overall, the resulting rule is biologically plausible, using only information available to neuron
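The ingredients of the rule described above (a learning rate, the gain and encoder of the receiving neuron, a corrective signal, and presynaptic activity) can be combined in a small rate-based sketch. The update form used here, dW[j, i] = kappa * gain_j * enc_j * c * a_i, is an assumption consistent with that description. The tuning-curve parameters, the value of `kappa`, and the use of the direct error `c = x - x_hat` as a stand-in for the corrective saccade are likewise hypothetical simplifications.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
tau_rc, tau_ref = 0.02, 0.002                  # assumed LIF constants
enc = rng.choice([-1.0, 1.0], size=n)          # encoders
gain = rng.uniform(0.5, 2.0, size=n)           # gains (hypothetical values)
x_int = rng.uniform(-0.9, 0.9, size=n)         # firing thresholds (hypothetical)

def rates(x):
    """Steady-state LIF rates (Hz), one row per eye position."""
    x = np.atleast_1d(x)
    J = 1.0 + gain * (enc * x[:, None] - x_int)
    a = np.zeros_like(J)
    on = J > 1.0
    a[on] = 1.0 / (tau_ref - tau_rc * np.log(1.0 - 1.0 / J[on]))
    return a

xs = np.linspace(-1.0, 1.0, 201)
A = rates(xs)
reg = len(xs) * (0.1 * A.max()) ** 2
d_opt = np.linalg.solve(A.T @ A + reg * np.eye(n), A.T @ xs)

# Start from optimal weights perturbed by 30% multiplicative noise (illustrative).
d_noisy = d_opt * (1.0 + 0.3 * rng.standard_normal(n))
W = np.outer(gain * enc, d_noisy)              # W[j, i]: weight from neuron i to neuron j

def decoded(W, a):
    """Eye position implied by the feedback current into neuron 0."""
    return (a @ W[0]) / (gain[0] * enc[0])

def rmse(W):
    return np.sqrt(np.mean((decoded(W, A) - xs) ** 2))

rmse_before = rmse(W)
kappa = 2e-7                                   # learning rate (assumed value)
for _ in range(20000):
    x = rng.uniform(-1.0, 1.0)
    a = rates(x)[0]
    c = x - decoded(W, a)                      # stand-in for the corrective signal
    W += kappa * np.outer(gain * enc * c, a)   # dW[j, i] = kappa * gain_j * enc_j * c * a_i
rmse_after = rmse(W)
```

Each weight change depends only on a global scalar signal, the receiving neuron's own gain and encoder, and the presynaptic activity, which is what makes a rule of this form local.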

There have been similar learning rules proposed in the literature. For example

More generally, there has been a wide variety of work examining Hebbian-like reinforcement learning (also called reward modulated Hebbian learning) that proposes rules with a similar mathematical form to Equation 17

However, we can extend past work by taking advantage of the NEF decomposition used in the derivation of the previous rule. In particular, the decomposition makes it clear how we can generalize the simple rule we have derived from learning scalar functions to learning arbitrary vector functions. Consider a derivation analogous to that above, which directly replaces encoding and decoding weights (

The encoding vector

Previous learning models of the oculomotor integrator

The OMS model contains saccadic, smooth pursuit, and fixation subsystems controlled by an internal monitor. The model uses retinal signals and an efferent copy of the motor output signals to generate motor control commands. It includes the simulation of plant dynamics, and has parameters to simulate normal ocular behavior as well as several disorders. For this study, all parameters were set for normal, healthy ocular behavior.

To test our learning algorithm, we replaced the neural integrator of the OMS model with the spiking integrator model described above. To compare the tuning of our network to the experimental results of

The neural integrator in this study was constructed in Simulink and embedded into the OMS model. The OMS model is available at

The learning rule used a value of

To appropriately characterize the behavior of the model, each simulation experiment consisted of running 30 trials each with a different, randomly generated network, allowing the collection of appropriate statistics. For each trial, a new set of tuning curves for the neurons, and a new set of input functions, were randomly generated. The parameters of the tuning curves were determined based on an even distribution of

Ten different experiments were run in this manner. The first was the linear optimal integrator described above. The connections between the neurons in the linear optimal network are defined by Equation 9. All subsequent experiments start from these weights unless otherwise specified.

Several experiments add noise to the connection weights of the linear optimal integrator over time. Noise was added to the connection weight matrix

In experiment 2,

The third experiment consisted of allowing the learning rule to operate on the connection weights of the integrator networks from experiment 2. That is, after being run with the above noise and no learning for 1200 s (resulting in 30% noise), the learning rule (and no additional noise) was run for 1200 s. The fourth experiment allowed the integrator to learn while noise was continuously added to the original optimal network weights. Noise was added in the same manner as equation 19, but concurrently with learning. In this case

Experiments seven and eight were run to reproduce the results of

The ninth and tenth experiments demonstrate that the rule is able to account for recovery from lesions

Two benchmarks were used to quantify the performance of the neural integrator in these experiments. The first was root-mean-square error (RMSE) between the plot of actual feedback and the exact integrator (i.e., a line of slope 1 through the origin). This is determined by comparing the represented eye position for each possible input to the actual position given that input, and taking the difference. This provides an estimate of the representational error caused by one forward pass through the neural integrator. As a result, this error is measured in degrees. The lower this error, the slower the integrator will drift over time on average.
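The RMSE benchmark can be sketched as follows. The grid of normalized eye positions and the example slope of 0.99 are assumptions for illustration only.

```python
import numpy as np

def transfer_rmse(x, feedback):
    """RMSE between a feedback transfer function and the exact integrator (feedback = x)."""
    x = np.asarray(x, dtype=float)
    feedback = np.asarray(feedback, dtype=float)
    return float(np.sqrt(np.mean((feedback - x) ** 2)))

x = np.linspace(-1.0, 1.0, 201)            # normalized eye positions (assumed grid)
rmse_exact = transfer_rmse(x, x)           # slope 1: no error
rmse_damped = transfer_rmse(x, 0.99 * x)   # slightly damped feedback
```

Values computed on normalized axes would need to be scaled by the eye-position range to be expressed in degrees.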

The second measure was the time constant,

Data were collected for 30 randomly generated networks (i.e., neuron parameters are randomly chosen as described above) and used to calculate a mean and 95% confidence interval (using bootstrapping with 10,000 samples) for both RMSE and
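A bootstrapped confidence interval of this kind can be sketched as below. The percentile method is an assumption, since the text specifies only the number of bootstrap samples, and the 30 input values are synthetic placeholders rather than results from the model.

```python
import numpy as np

def bootstrap_ci(values, n_resamples=10_000, level=0.95, seed=0):
    """Return the sample mean and a percentile-bootstrap confidence interval for it."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample with replacement: one row of indices per bootstrap sample.
    idx = rng.integers(0, len(values), size=(n_resamples, len(values)))
    means = values[idx].mean(axis=1)
    lo, hi = np.percentile(means, [100 * (1 - level) / 2, 100 * (1 + level) / 2])
    return values.mean(), lo, hi

# Hypothetical per-network RMSE values for 30 networks (synthetic, not the paper's data).
rmse_values = np.random.default_rng(42).lognormal(mean=-2.0, sigma=0.3, size=30)
mean, lo, hi = bootstrap_ci(rmse_values)
```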

To demonstrate the effectiveness of the proposed learning rule (equation 17), we present the results of the ten experiments in order to benchmark the system and reproduce a variety of plasticity observations in the oculomotor system.

The summary results of the ten experiments are shown in

a) RMSE and b) the magnitude of

Experiment | | RMSE (degrees) | | Time constant (s) | |

# | Name | Mean | CI | Mean | CI |

1 | Optimal | 0.129 | 0.115–0.138 | (+) 41.4 | 31.2–55.6 |

2 | Noisy | 2.156 | 1.693–2.699 | (+) 10.6 | 5.85–18.2 |

3 | Learned+Perturb | 0.671 | 0.312–1.178 | (+) 98.7 | 58.5–153 |

4 | Learned+Noise | 0.712 | 0.595–0.854 | (−) 31.6 | 13.5–60.1 |

5 | Learned+Perturb+Noise | 1.120 | 0.606–1.838 | (+) 41.4 | 18.9–78.8 |

6 | Learned+NoNoise | 0.183 | 0.170–0.193 | (+) 122 | 88.1–165 |

7 | Unstable | 0.382 | 0.364–0.395 | (−) 15.5 | 13.8–17.1 |

8 | Damped | 0.313 | 0.294–0.329 | (+) 10.9 | 9.19–13 |

9 | Lesion | 0.824 | 0.561–1.142 | (+) 30.8 | 20.2–46.2 |

10 | Recovery | 0.513 | 0.359–0.716 | (−) 51.3 | 25.4–88.1 |

After an initial disturbance (30%) to connection weights.

With continuous noise (10%) added to connection weights.

After an initial disturbance (30%) and continuous noise (5%).

No noise.

The root-mean-squared error (RMSE), measured in degrees, quantifies the average difference between the exact integrator transfer function (a straight line) and the estimated transfer function of the model circuit (as described in

The four experiments in which the system learns under a variety of noise profiles demonstrate the robustness of the rule. As is evident from

The linear Optimal, Noisy (30% perturbation to connection weights), and Learned+Perturb

The linear Optimal network is closer to the exact integrator over the range of eye positions. Although deviations of the Noisy network from the exact integrator are small, the effects on stability are highly significant (see

In fact, as shown in

Consequently we consider the rule under continuous noise. With the continuous addition of 10% noise (Learned+Noise

In the case of combined initial and continuous noise (Learned+Perturb+Noise

Taken together, these results suggest that the learning rule is as good as the optimization at generating and fine-tuning a stable neural integrator. In fact, with no noise (Learned+NoNoise

The results can also be compared to the goldfish integrator, which has empirically measured time constants that range between 29 s and 95 s, with a mean of 66 s

A single raw recording is shown on the left, along with the corresponding eye trace. Arrows indicate times of saccade (black right, grey left; adapted from

The results from the unstable and damped experiments reproduce the major trends observed in the experimental results, as shown in

The top trace is for the control situation, which for the model is tuning after a 30% perturbation and 5% continuous noise. The middle trace shows the unstable integrator, and the bottom trace shows the damped integrator. The goldfish traces are from animals that had longer training times (6 h and 16.5 h, respectively) than the model (20 min). Both the model and experiment demonstrate increased detuning with longer training times (not shown), and both show the expected detuning (drift away from midline for the unstable case, and drift towards midline in the damped case).

Experiment | Simulation (20 min training) | Empirical Data (1 h or more training) |

6 Learned+Perturb | 41.4 | 66.0 |

7 Unstable | 15.1 | 4.3 |

8 Damped | 10.9 | 7.7 |

The slope of the Unstable network is greater than 1 and that of the Damped network is less than 1. The re-tuned networks demonstrate the expected drifting behavior (see

One noticeable difference between the experiments and simulations is the variability in the system after training. While the standard deviations for the experimental results are not available, the range of one correctly tuned experiment is reported as being from −31 s to 15 s

To simulate the lesion of a single neuron, the network was tuned to the linear optimal weights before a single neuron was removed. Lesioning a neuron resulted in an increase in RMSE from 0.129 to 0.824 and a decrease in time constant to about 10 s. To demonstrate the recovery process documented by

Severe drift is evident after randomly removing one of the 40 neurons. After 1200 s of recovery with the learning rule under 5% noise, the time constant improves back to pre-lesioning levels.

In other work, we have shown how this characterization of the oculomotor integrator as a line attractor network can be generalized to the family of attractor networks including ring, plane, cyclic, and chaotic attractors

For example, the ring attractor is naturally characterized as a stable function attractor (where the stabilized function is typically a “bump”), as opposed to the scalar attractor of the oculomotor system. Similarly, a 2D bump attractor, which has been used by several groups to model path integration in rat subiculum

a) Gaussian-like tuning curves of 20 example neurons in a one-dimensional function space (7-dimensional vector space). These are tunings representative of neurons in a head-direction ring attractor network. b) Multi-dimensional Gaussian-like tuning curves of four example neurons in a two-dimensional function space (14-dimensional vector space). These are tunings representative of neurons in a subicular path integration network.

Analogous simulations to the oculomotor Learned+Perturb

a) The input (dashed line) along with the final position of the representation after 500 ms of drift for pre-training (thick line) and post-training (thin line). b) The pre-training drift in the vector space over 500 ms at the beginning of the simulation for the bump (thick line in a). c) The drift in the vector space over 500 ms after 1200 s of training in the simulation (thin line in a). Comparing similar vector dimensions between b) and c) demonstrates a slowing of the drift. d) A 2D bump in the function space for the simulated time shown in e), after training. e) The vector drift in the 14-dimensional space over 500 ms after training.

As shown in

These simulations are intended only as a proof-in-principle that the learning rule generalizes, and are clearly inaccurate regarding the biological details of both systems (e.g., neuron parameters should not be the same as the oculomotor integrator). More importantly, the generalized error needed in each simulation

The simulations described in this paper demonstrate one possible solution to the problem of fine-tuning in neural integrators. The oculomotor model was able to achieve and maintain finely-tuned connection weights through a biologically plausible learning algorithm. Specifically, the learning rule allowed recovery from large perturbations of connection weights, continuous perturbation of connection weights, and the lesioning of cells. Not surprisingly, these results are in agreement with other experimental findings that suggest that feedback plays an important role in the behavior of the oculomotor integrator

Consideration of the learning rule suggested here demonstrates that on-line fine-tuning is a viable

Using feedback to tune the integrator results in learned connection weights that produce the same or even longer time constants than the theoretically derived linear optimal connection weights, despite a significantly larger RMSE (compare experiments one, three, four, five, and ten). This is likely because the calculation of linear optimal weights does not account for dynamics of the eye or the spiking non-linearities in the neurons (see

In contrast, the learning algorithm is employed alongside the simulation of the oculomotor plant and single cell dynamics, so the learned weights are calculated for a more complete model rather than an approximation to that model. The effect of these approximations is most directly demonstrated by experiment six, in which the learning rule tunes the system with no noise. In this case, the average learned time constant is three times longer than that of the linear optimal network, even though the RMSE is higher as well. This is true regardless of how much noise is assumed during the optimization process (results not shown). This suggests that typical theoretical methods for tuning connection weights are not generally “optimal” in fully spiking network models.

Despite the limitations of these theoretical optimization methods, they are important for allowing the network to be in a neighbourhood where it can be fine-tuned. This rule will not tune a completely random network with large amounts of continuous noise, for instance. As a result, one empirically testable consequence of this model is a characterization of the maximum amount of noise such a mechanism can tolerate. In particular, the system is robust under 10% continuous noise, or under 30% initial and 5% continuous noise. This makes it reasonable to expect that the amount of continuous noise of this type in the system would be on the order of 5–10% (over twenty minutes). While this degree of robustness is significant, it remains to be seen how robust the biological integrator is to these same kinds of perturbation, and how severe intrinsic perturbations in the system are. Given our model, we suggest that the magnitude of intrinsic perturbations could be determined by examining the extent and speed of detuning when corrective saccades are inhibited or removed. For instance, under the same 10% continuous noise for 200 minutes with no corrective saccades, the average system time constant becomes 7.68 s (confidence interval: 4.67 s–11.8 s) in the model. We leave for future consideration careful characterization of the relationship between continuous noise, one-shot noise, learning rates, and the absence of corrective saccades.

It can also be noted that the speed with which the model converges to stability is a function of the learning rate,

A related empirical question that arises given this model is: How are corrective and intentional saccades distinguished? In the model, that distinction is made by filtering based on the magnitude of the velocity command. However, it remains an open question what the biological mechanism underlying this filtering might be. This issue is left largely unaddressed here because there are several potential means of identifying corrective saccades. For example, the learning process may require a kind of “activation energy” to initiate learning, in which case large saccades would reduce this energy and act as inhibitors for learning. It is also possible that the (amplitude independent) frequency content of saccades is used to trigger the learning process, such that intentional saccades do not cause modification of the synaptic weights. In addition, the duration of the saccades could be used as a means of distinguishing intentional from corrective saccades. In the end, the magnitude filtering implemented in this study was chosen because of its simplicity and the lack of experimental evidence for any one of these potential mechanisms.
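One simple form that magnitude-based filtering could take is a hard threshold on the velocity command. This is a hypothetical sketch and not the filter used in the model: the threshold value, the hard cutoff, and the assumption that small-magnitude commands are the corrective ones are all illustrative choices.

```python
import numpy as np

def corrective_component(velocity, threshold):
    """Pass velocity commands whose magnitude is below `threshold`; zero the rest."""
    velocity = np.asarray(velocity, dtype=float)
    return np.where(np.abs(velocity) < threshold, velocity, 0.0)

commands = np.array([0.5, 30.0, -0.2, -80.0])   # deg/s, illustrative values
learning_signal = corrective_component(commands, threshold=5.0)
```

In this sketch all commands would still drive the integrator as usual, and only the filtered signal would gate the weight updates.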

Notably, our particular choice of filtering method does not seem crucial. We have run single simulations with other filtering methods with similar results. For example, using the filtering by change in position (see

Consideration of the generalized learning rule raises interesting possibilities that could be tested experimentally. Perhaps most speculatively, the rule suggests that intrinsic neuron properties play a central role in how a particular neuron is exploited by a system. The encoding vector

Much less speculatively, the general structure of the rule suggests simple behavioral experiments. For example, if an error signal analogous to retinal slip is available to head-direction, path integration, or working memory systems, it should be possible to similarly mis-tune those systems with careful manipulation of the stimulus. If such mis-tuning is achievable, it would suggest that this kind of plasticity is broadly important for the neural control of behavior.

Returning to the saccadic system specifically, it is evident that the error signal is generated by elements of the oculomotor system external to the integrator itself. However, it is clearly the case that such a signal is self-generated by the neurobiological system as a whole (as captured by the OMS model). This signal allows for a kind of “self-directed organization” of the system. The generalization suggests that any other error signal that can be self-generated can also be exploited by this rule for tuning a network to perform other kinds of computations. Preliminary results show that this generalized rule is able to learn arbitrary non-linear vector transformations

In addition, clear differences in the consequences of different kinds of learning arise in the case of stability. A supervised rule, such as backpropagation through time

We thank T. Stewart, M. Hurwitz, D. Rasmussen, T. Bekolay, X. Choo, C.H. Anderson, and Y. Tang for helpful discussions, comments, and technical support.