
The authors have declared that no competing interests exist.

Conceived and designed the experiments: JZ. Performed the experiments: JZ. Analyzed the data: JZ. Wrote the paper: JZ MRD.

The sparse coding hypothesis has enjoyed much success in predicting response properties of simple cells in primary visual cortex (V1) based solely on the statistics of natural scenes. In typical sparse coding models, model neuron activities and receptive fields are optimized to accurately represent input stimuli using the least amount of neural activity. As these networks develop to represent a given class of stimulus, the receptive fields are refined so that they capture the most important stimulus features. Intuitively, this is expected to result in sparser network activity over time. Recent experiments, however, show that stimulus-evoked activity in ferret V1 becomes less sparse as the animals mature.

The popular sparse coding theory posits that the receptive fields (RFs) of visual cortical neurons maximize the efficiency of the neural representation of natural images. Models implementing this idea typically minimize a combination of the error in reconstructing natural images from neural activities, and the average level of activity in the model neurons. In simulations, these models are presented with natural images and the RFs then develop so as to increase representation efficiency. After a long developmental period, the model RFs typically agree well with those observed experimentally in visual cortex. Since the models seek to minimize (for a given level of reconstruction error) the neural activity levels, the average levels of neural activity might be expected to decrease as the models develop. In the developing mammalian cortex, visual RFs are also modified during development, so the sparse coding hypothesis might appear to suggest that activity levels should decrease during development. Recent experiments with young ferrets show the opposite trend: mature animals tend to have more active visual cortices. Herein, we demonstrate that, depending on the models' initial conditions, some sparse coding models can exhibit increasing activity levels while learning the same types of RFs that are observed in visual cortex: the developmental data do not preclude sparse coding.

A central question in systems neuroscience is whether optimization principles can account for the architecture and physiology of the nervous system. One candidate principle is sparse coding (SC), which posits that neurons encode input stimuli accurately while using the least amount of neural activity possible.

Throughout this paper, we make reference to the notion of “sparseness”. Intuitively, sparseness is related to there being either a small subset of neurons active at any time (population sparseness), or to each neuron being active only a small fraction of the time (lifetime sparseness).
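
As a concrete illustration (not taken from the paper), both notions can be computed from a matrix of firing rates with the normalized Treves–Rolls measure used later in the Methods; the code below is a minimal sketch, and the variable names and toy data are our own.

```python
import numpy as np

def treves_rolls_sparseness(r):
    """Normalized Treves-Rolls sparseness of a nonnegative rate vector r.

    Returns a value in [0, 1]; values near 1 indicate a sparse
    distribution (most entries near zero, a few large).
    """
    r = np.asarray(r, dtype=float)
    n = r.size
    tr = (r.mean() ** 2) / np.mean(r ** 2)   # Treves-Rolls "activity" ratio
    return (1.0 - tr) / (1.0 - 1.0 / n)      # dense code -> 0, sparse code -> 1

# rates: matrix of responses, shape (n_neurons, n_stimuli); toy heavy-tailed data
rates = np.abs(np.random.randn(64, 500)) ** 3

# population sparseness: across neurons, for each stimulus, then averaged
pop_sparseness = np.mean([treves_rolls_sparseness(rates[:, s])
                          for s in range(rates.shape[1])])

# lifetime sparseness: across stimuli, for each neuron, then averaged
life_sparseness = np.mean([treves_rolls_sparseness(rates[i, :])
                           for i in range(rates.shape[0])])
```

Note that the same scalar function yields both measures; only the axis over which it is applied (neurons vs. stimuli) differs.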

In further support of the SC hypothesis, measurements of the firing rates of V1 neurons in response to videos of natural scenes show that those rates are low, and that the firing rate distributions are sharply peaked near zero.

In simulating the development of a sparse coding model, one typically initializes the model parameters randomly, and then updates them as the network is presented with natural images.

Multi-unit activity in primary visual cortex (V1) of awake young ferrets watching natural movies shows decreasing sparseness over time. The sparseness metrics shown in this figure are defined in the results section of this paper, and the data are courtesy of Pietro Berkes.

The above discussion hints at a major source of confusion in this area of research. In particular, sparseness is discussed both as a relative measure, used to compare two codes, and as an absolute property of a single code.

The ferret sparseness-over-time data appear to contradict the SC hypothesis. At the same time, that hypothesis has otherwise been quite successful in explaining some key features of peripheral sensory systems. It is therefore natural to ask whether sparse coding models necessarily predict increasing sparseness during development.

We will also see that, for appropriately chosen initial conditions, the same can be true of the canonical SparseNet model of Olshausen and Field.

Since this paper focuses primarily on our SAILnet model, we begin by briefly describing that model.

In our model, described in detail elsewhere, leaky integrate-and-fire (LIF) neurons receive feed-forward input from the image pixels and interact through recurrent inhibitory connections.

The neurons' firing thresholds are modified over time so as to maintain a target lifetime-average firing rate. For our LIF neurons, this is similar to synaptic scaling, which has been proposed as a mechanism to stabilize correlation-based learning schemes.

Finally, the feed-forward weights are learned by the network, so that the neuronal activities form an optimal linear generative model of the input stimulus, subject to the constraints imposed by limited firing rates and minimal correlations. The derivation of our learning rules from this objective function is presented in the Methods section.
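
To make the structure of such local rules concrete, here is a schematic Python sketch of updates of the general form used in SAILnet: a Hebbian/anti-Hebbian rule for the recurrent inhibition, an Oja-like rule for the feed-forward weights, and a homeostatic rule for the thresholds. The learning rates, initial values, and the nonnegativity clipping are our own illustrative choices, not the paper's exact parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_units, p = 256, 64, 0.05          # patch size, neurons, target firing rate
alpha, beta, gamma = 0.01, 0.001, 0.01     # illustrative learning rates

Q = rng.normal(0.0, 0.1, (n_units, n_pix))  # feed-forward weights
W = np.zeros((n_units, n_units))            # recurrent inhibition
theta = rng.normal(2.0, 0.1, n_units)       # firing thresholds

def update(X, n):
    """One parameter update given an image patch X and spike counts n."""
    global Q, W, theta
    # Hebbian/anti-Hebbian rule pushes pairwise correlations toward p**2
    W += alpha * (np.outer(n, n) - p ** 2)
    np.fill_diagonal(W, 0.0)                # no self-inhibition
    W[W < 0] = 0.0                          # keep inhibitory strengths nonnegative
    # Oja-like rule fits a linear generative model of the input
    Q += beta * n[:, None] * (X[None, :] - n[:, None] * Q)
    # homeostatic threshold rule holds each unit near the target rate p
    theta += gamma * (n - p)
```

All three updates use only quantities locally available to each synapse or neuron, which is the property the text emphasizes.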

To study the change in sparseness over time, we ran SAILnet simulations in which the feed-forward weights, recurrent connection strengths, and firing thresholds were all initialized with Gaussian-distributed white noise. At different times during the development process, we recorded the simulated neuronal activity in response to randomly selected batches of natural images. Following a recent experimental study, we computed three measures of the sparseness of this activity.

The first of these, the “activity sparseness,” is defined in the Methods section.

The “population sparseness” quantifies how few of the neurons are active in response to any given stimulus.

Finally, the “lifetime sparseness” quantifies how infrequently each individual neuron is active over the course of many stimuli.

Intuitively, all of these measures are somewhat related, although they are not interchangeable.

In order to further facilitate meaningful comparison with the experiment of Berkes and colleagues, we mimicked a multi-unit activity measurement by randomly grouping together sets of 8 SAILnet neurons, whose activities were then summed to form a multi-unit response. These “multi-unit” activities were used for computing our sparseness measures. This procedure yielded results qualitatively similar to those obtained from the individual model neurons.
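
The grouping step above can be sketched as follows; the function name and toy data are ours, and the only assumption is that the number of units is a multiple of the group size.

```python
import numpy as np

def multiunit_responses(rates, group_size=8, rng=None):
    """Randomly group single units and sum activities within each group.

    rates: array of shape (n_neurons, n_stimuli); n_neurons is assumed
    to be a multiple of group_size. Returns an array of shape
    (n_neurons // group_size, n_stimuli).
    """
    rng = np.random.default_rng(rng)
    perm = rng.permutation(rates.shape[0])   # random assignment to groups
    groups = perm.reshape(-1, group_size)
    return np.stack([rates[g].sum(axis=0) for g in groups])

rates = np.abs(np.random.randn(256, 100))    # toy single-unit responses
mua = multiunit_responses(rates, group_size=8)
# mua now plays the role of "multi-unit" activity; the same sparseness
# measures are then computed on these summed responses.
```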

A SAILnet simulation was performed in which the RFs, firing thresholds, and recurrent connection strengths were initialized with random numbers (see Methods).

The time course of the sparseness measures depends on the learning rates (parameter modification step sizes), with smaller learning rates leading to slower changes in sparseness measures, as expected (data not shown). The depth of the observed “undershoot” also depends on the initial conditions and the learning rates. The specific activity sparseness values likewise depend on these choices.

The receptive fields learned by the model during this period of decreasing sparseness are in good quantitative agreement with a measured corpus of 250 macaque monkey V1 simple cell receptive fields, as we demonstrate below in the section on comparisons of receptive field shapes.

The model discussed in this section differs from the one above only in its initial conditions.

We find that SAILnet does not always exhibit decreasing sparseness: for other choices of initial conditions, sparseness increases during learning.

In this case, the relatively low firing thresholds and relatively small amount of lateral inhibition lead to the initial network state being less sparse than the final (equilibrium) state, so sparseness increases over time.

A SAILnet simulation was performed in which the RFs were initially randomized, and the recurrent inhibitory connection strengths and firing thresholds were initialized with random numbers that were smaller than for the simulation described in the previous section.

We emphasize that, compared to the model discussed in the previous section, only the initial conditions were changed.

While our SAILnet model shows that sparseness can either increase or decrease during learning, one might ask whether this behavior is peculiar to SAILnet.

To explore this issue more fully, we return to the canonical SparseNet model of Olshausen and Field.

In what follows, we use the SparseNet code of Olshausen and Field.

We begin by initializing these basis functions with Gaussian white noise of variance

To check that our conclusions apply to other models besides SAILnet, we performed simulations with the publicly available SparseNet code of Olshausen and Field

These changes can be understood by recalling that, during inference — where the activities of the units are determined in response to a given image — the activities are chosen to minimize a cost function that combines the image reconstruction error with a penalty on the overall level of activity.
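
For reference, the SparseNet inference cost has the well-known Olshausen–Field form of a squared reconstruction error plus a sparseness penalty on the activities. The sketch below uses the log(1 + x²) penalty; the parameter values `lam` and `sigma` are illustrative and may differ from those used in the paper.

```python
import numpy as np

def sparsenet_cost(I, Phi, a, lam=0.1, sigma=0.316):
    """Cost minimized during SparseNet-style inference.

    I:   image patch, shape (n_pix,)
    Phi: basis functions as columns, shape (n_pix, n_units)
    a:   unit activities, shape (n_units,)
    """
    recon_error = np.sum((I - Phi @ a) ** 2)        # reconstruction term
    sparsity = np.sum(np.log1p((a / sigma) ** 2))   # penalty on activity
    return recon_error + lam * sparsity
```

With all activities at zero the sparsity term vanishes and the cost reduces to the squared norm of the image, which makes concrete why noisier (less informative) bases force larger activities and hence less sparse codes.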

Putting all of this together, if we initialize the bases with Gaussian white noise, the initial sparseness of the representation depends on the variance of that noise, and hence so does the direction in which sparseness subsequently changes.

As in SAILnet, one way to understand these trends is to recall that the model parameters dictate the final “equilibrium” state of the model, but the initial conditions can be chosen independently of the final state. As such, initial conditions can be chosen to be either more or less sparse than the equilibrium condition, leading to sparseness either decreasing or increasing over time.

In many theoretical studies, the receptive fields learned by a model are compared with those measured in visual cortex.

Typically, this comparison is either done by eye, or by fitting a parameterized shape model (such as a Gabor function) to both the model and the experimental RFs and comparing the distributions of best-fit parameters.

The by-eye comparisons are not very quantitative, even when they are aided by fitted shape models, and any fitting of parameterized shape models is vulnerable to failures of the shape function: RFs whose shapes are not well described by the parameterized function will yield nonsense best-fit parameter values.

To get around these difficulties, we introduce a novel method for directly comparing the shapes of theoretical and experimental receptive fields, using image registration. In this technique, we assume that receptive fields may differ by a translation, rotation, and/or global size rescaling, yet still have the same shape. For example, consider an equilateral triangle within a bounding box. A shifted, rotated, and resized version of that shape is still an equilateral triangle.
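
As an illustration of the idea (the paper's actual analysis used MATLAB's image registration tools), the following sketch performs a coarse brute-force search over similarity transforms (rotation, scale, translation) and scores each candidate by the fraction of variance it explains in the target RF; the search grids and helper names are our own.

```python
import numpy as np
from scipy import ndimage

def explained_variance(target, candidate):
    """Fraction of variance in `target` explained by `candidate`."""
    resid = target - candidate
    return 1.0 - np.sum(resid ** 2) / np.sum((target - target.mean()) ** 2)

def best_similarity_match(target, model_rf,
                          angles=range(0, 360, 15),
                          scales=(0.8, 1.0, 1.25),
                          shifts=range(-4, 5, 2)):
    """Coarse grid search over rotation, scale, and translation.

    Returns the largest explained-variance score found. A real analysis
    would use a proper registration routine and a finer search; this only
    illustrates the idea.
    """
    best = -np.inf
    for ang in angles:
        rot = ndimage.rotate(model_rf, ang, reshape=False, order=1)
        for s in scales:
            sc = ndimage.zoom(rot, s, order=1)
            img = np.zeros_like(target)            # crop/pad to target shape
            h = min(sc.shape[0], img.shape[0])
            w = min(sc.shape[1], img.shape[1])
            img[:h, :w] = sc[:h, :w]
            for dy in shifts:
                for dx in shifts:
                    cand = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
                    # allow an overall amplitude rescaling (least squares)
                    g = np.sum(cand * target) / max(np.sum(cand ** 2), 1e-12)
                    best = max(best, explained_variance(target, g * cand))
    return best
```

Matching an RF against itself recovers a score of essentially 1 at the identity transform, which is the sanity check the equilateral-triangle intuition suggests.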

We apply this intuition to the comparison between our model receptive fields, and a set of 250 macaque V1 receptive fields courtesy of D. Ringach. We do this by taking each experimentally measured V1 receptive field and then for each model RF we find the combination of translation, rotation, and overall rescaling that gives the best match between the experimental and transformed-model RF. We quantify the match by the fraction of the variance in the experimental RF that is explained by the transformed model RF.

Once we have done this for all model RFs, we take the one whose best transform yields the largest explained variance as the match to that experimental RF.

Looking at the macaque RFs, it is apparent that the measurements are contaminated by noise.

In using this technique to estimate the noise level, we essentially assume that the RF shapes are repeated in the experimental data, as inspection of the measured RFs suggests.

The macaque-to-macaque comparisons provide an estimate of the measurement noise: they indicate how well any model could be expected to match the data.

Because the model RFs can explain an average of roughly

Generally, larger networks have a greater diversity of receptive field shapes.

Our quantitative non-parametric RF comparison method could be used to compare many different theories to experimental data, and thus to ascertain which ones provide the best fit. That comparison is beyond the scope of this paper.

We have demonstrated that a computational model (SAILnet) that learns sparse codes for natural images can exhibit either increasing or decreasing sparseness during development, depending on its initial conditions.

In order to quantify the similarity between experimentally measured V1 receptive fields and the receptive fields learned by our SAILnet model, we have further introduced a novel non-parametric RF comparison tool based on image registration techniques.

Since sparseness can decrease during development, with the mature network state still performing sparse coding, the type of active sparseness maximization disproven by recent experiments is not a necessary ingredient of the sparse coding hypothesis.

One possibility that requires consideration is that the sparseness data of Berkes and colleagues were collected over a period in which the receptive fields were no longer changing appreciably, in which case those data would not constrain developmental models.

In order for this comparison to be “fair,” one must ensure that receptive fields undergo significant change during the developmental period over which Berkes and colleagues measured sparseness. Indeed, other experimenters have observed that the orientation and direction selectivity of the neurons in ferret V1 increase during this same period, indicating that the RFs are still being refined.

Of course, there could always be other reasons — beyond the scope of this paper — why the developmental data fail to be relevant to the sparse coding hypothesis. We leave that question for future work.

For the sake of completeness, we note that Rochefort and colleagues have observed that activity in the developing mouse visual cortex becomes sparser around the time of eye opening.

We are not the first to propose that homeostasis might underlie experience-dependent modification of the nervous system. Indeed, Marder and others have strongly and persuasively argued that neural systems might have a desired operating point such that, when perturbed, they use homeostatic mechanisms to return to that desired functional state.

Finally, recent work by Perrinet has also emphasized the role of homeostatic mechanisms in learning sparse representations.

We have demonstrated that the mature network state can perform sparse coding regardless of whether learning is accompanied by an increase or decrease in sparseness. At the same time, sparse coding is not the only principle that has been proposed in order to understand V1 function. Of particular interest in this regard is a recent study by Berkes and colleagues, which interprets V1 activity in terms of probabilistic inference with an internal model of the visual world.

There are many different ways to measure sparseness. In this work, we follow the experimental study of Berkes and colleagues in our choice of sparseness measures, which we define below.

First, the “activity sparseness” quantifies, for each stimulus, the fraction of units that are essentially inactive.

In addition, we recorded two other sparseness measures, originally due to Treves and Rolls: the population sparseness and the lifetime sparseness.

The SAILnet model is described in detail elsewhere; here we summarize the aspects relevant to the present study.

For all simulations shown herein, the feed-forward weights were initialized with Gaussian-distributed white noise.

We note that in all cases, the specific shape of the sparseness vs. time plot depends on the choice of initial conditions. However, for a large class of initial conditions, the sparseness will decrease over time, and for another large class of initial conditions, it will increase (data not shown). Thus, our qualitative conclusion is not particularly sensitive to the exact numerical values described above.

The SparseNet results were generated using code publicly distributed by Bruno Olshausen.

In both cases, 256 units were used, and the model was trained on patches drawn from natural images.

To perform the quantitative comparison of receptive field shapes, we used the image registration tools in MATLAB.

Our data consist of 250 macaque V1 receptive fields, measured using reverse-correlation methods in the lab of Dario Ringach.

For each macaque RF, we performed an exhaustive search over all model RFs, wherein we found the best similarity transform to match each model RF to the macaque RF; we then took the best-matching model RF (with its best similarity transform) as the fit.

To generate a benchmark to assess how good a “good” match is, we also matched each macaque RF against all of the other macaque RFs in the same fashion.

The ratio between these numbers – the data-vs.-model match quality relative to the data-vs.-data benchmark – quantifies how much of the explainable structure in the measured RFs the model captures.

The authors thank Pietro Berkes for sharing the ferret V1 developmental data points and Dario Ringach for providing the macaque V1 receptive field data. The authors are grateful for helpful comments from Fritz Sommer.