Slow Feature Analysis Yields a Rich Repertoire of Complex Cell Properties. Berkes, Wiskott. Journal of Vision 2005.


  1. Uses SFA to learn receptive fields
  2. Receptive fields learned are similar to those found in cells in V1
  3. The learned functions' properties also match those found “… in complex cells, such as direction selectivity, non-orthogonal inhibition, end-inhibition, and side-inhibition.”
  4. Neurons in V1 are divided between simple and complex cells – they are both edge or line detectors, but
    1. Simple cells are sensitive to lines at a particular position and orientation, while
    2. Complex cells respond to orientation but not position
  5. “Idealized simple and complex cells can be described by Gabor wavelets… A single Gabor wavelet used as a linear filter (…) is similar to a simple cell, because the response depends on the exact alignment of a stimulus bar on an excitatory (positive) subfield of the wavelet.”
  6. A pair of Gabors with a 90 degree phase difference can be used to create a complex cell
  7. This, however, is a slightly oversimplified picture of what goes on in V1
    1. Cells also “… show end-inhibition, side-inhibition, direction selectivity, and sharpened or broadened tuning to orientation or frequency (…)”
  8. Approach taken here is to assume that the visual system is adapted to satisfy some computational objective
    1. “The computational approach does not necessarily provide an explanation of the cortical mechanisms involved in the computation, but it can give a powerful functional explanation of experimental data”
  9. Here they base the optimization on slowness, use SFA
  10. Consider degree-2 polynomials as basis functions
  11. “An early description of this principle was given by Hinton (1989, p.208)…”
    1. Also applied to stereograms (Stone, 1996)
  12. “It is important to note that even though the objective is the slowness of the output signals, the process by which the output is computed from the input is very fast or in the mathematical idealization even instantaneous.  Slowness can therefore not be achieved simply by low-pass filtering.”
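  13. The goal is to find functions that, when applied to the input signal, minimize the mean squared temporal derivative of the output, subject to the outputs having zero mean, unit variance, and being decorrelated (otherwise a constant output would be trivially slowest)
  14. There is also another measure of slowness, based on the square root of that mean squared derivative of the output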
  15. Images used were naturalistic images, there was some preprocessing done
  16. Moved a 16×16 window over the images by translation, rotation, and zoom+rescaling, with all three transformations applied simultaneously
    1. Basis functions are polys of degree 2. A preprocessing PCA step does dimension reduction first, because otherwise there is an O(N^4) cost. This reduces the dimension from 512 to 100 while still capturing 93% of the total variance
  17. To analyze units, they compute the optimal excitatory and inhibitory stimulus. “This is in analogy to the physiological practice of characterizing a neuron by the stimulus to which the neuron responds best (…).”
  18. <They talk about responses at t and Δt, which I don’t understand, seems like it just has to do with applying 2 different inputs and testing for motion sensitivity based on the differences of the two inputs, but the same slow features are used>
  19. The optimal excitatory and inhibitory stimuli in most cases look like Gabors
  20. Hexagonal-shaped images are used to test end- and side-inhibition
  21. Compute “… 5150 polynomials of degree 2.” for slow features; keep many of the slow features, throwing out those whose output varies faster than the input
  22. Performance between test and train data is pretty well maintained
  23. They keep the first 100 slow features
    1. “We did not find any unit among the first 100 whose properties were in contradiction with those of neurons of V1.”
  24. “The optimal stimuli of almost all units look like Gabor wavelets (Figure 3), in agreement with physiological data.  This means that the units respond best to edgelike stimuli.  The response of all these units is largely invariant to phase shift as illustrated by the lack of oscillations in the response images (Figure…).”
  25. Each unit is presented with sine gratings and classified by “… the relative modulation rate F1/F0 (i.e. the ratio of the amplitude of the first harmonic to the mean response).”  If this value is < 1 the unit is considered complex; if it is > 1, simple.
    1. All units have a relative modulation rate < 1, so they would all be considered complex cells
  26. “In the classical model, complex cells have no inhibition and are correspondingly restricted in their functional properties (…). In physiological neurons, however, active inhibition is present and useful…”
  27. Inhibition does exist in the functions found, is significant in terms of the actual responses, and takes the form of Gabors; the inhibitory orientation is often not orthogonal to the excitatory one, which narrows the orientation tuning.
  28. Distribution of spatial tuning is not significantly different from that found in cells in V1 in macaque.
  29. Cells in V1 can also be selective for:
    1. Length: end-inhibited cells
    2. Width: side-inhibited cells
    3. Otherwise, cells are not sensitive to these parameters, and are just selective of orientation
  30. Cells are found that have these properties as well
  31. Some end- and side-inhibited cells in V1 are also selective for curvature; units with this property were found here as well
  32. “Complex cells in V1 are sensitive to the motion of the presented stimuli.  Some of them respond to motion in both directions while others are direction-selective (…).”
    1. They also get units that are sensitive to motion: some are direction-selective, others respond to motion in both directions
  33. “The slowest units are usually less selective for orientation and frequency, have orthogonal inhibition, and their preferred speed is near zero.  Units with non-orthogonal inhibition, direction selectivity, and end- or side- inhibition predominate in a faster regime.”
  34. <Ignoring control experiments, although they examine how the different transformations used to create the training set lead to the presence (or absence) of certain types of cells.>
  35. <Discussion>
  36. Many other works have reproduced properties of simple cells, based on principles of sparseness, statistical independence, and slowness.
    1. These previous works, however, don’t include inhibition, and “… many of the illustrated complex cell behaviors are impossible to obtain without it.”
  37. “To our knowledge, the model presented here is the first one based directly on input images that is able to learn a population of units with a rich repertoire of complex cell properties, such as active inhibition, secondary response lobes, end-inhibition, side-inhibition, direction selectivity, tonic cell behavior, and curvature selectivity.”
  38. They use polys of degree 2 mainly because higher-order polys increase computational costs significantly.
  39. This brings up the question of what type of functions neurons can compute.  “Lau, Stanley, and Dan (2002) have fitted the weights of a nonlinear two-layer neural network <just a linear combination?> to the output of complex cells.  They found that the relation between the linear output of the subunits and the output of the complex cell is approximately quadratic (mean exponent 2.3 ± 1.1)… [so] considering the space of polynomials of degree 2 might be sufficient.”
    1. There is some other work that suggests degree 2 would make sense
  40. “Although the function space of polynomials of degree 2 is mathematically attractive and has proved to be appropriate in experimental and theoretical studies as discussed above, it is not able to encompass all input-output nonlinearities of visual neurons.  Divisive contrast gain control (…), saturation effects, and pattern adaptation are examples of nonlinear effects present in the visual cortex that cannot be realized.”
  41. “Temporal and spatial slowness are closely related concepts.  For example, in our model, temporal slowness could be reformulated as a spatial one by adapting each unit to respond in a similar way to neighboring visual regions.  The slowness objective could thus be reformulated as a spatial optimization criterion (Wiskott & Sejnowski, 2002).  However, the former seems more natural to us and easier to implement in a biological system.”
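
To make points 10, 13, and 21 a bit more concrete, here is a minimal sketch of SFA over degree-2 polynomials in NumPy. It is not the authors' code: the function names `quadratic_expand` and `sfa` are made up for this post, the input is a toy two-dimensional signal rather than image patches, and the PCA-to-100-dimensions step from the paper is skipped. The idea is just: expand the input into all monomials of degree 1 and 2, whiten, then take the directions in which the temporal differences have the least variance.

```python
import numpy as np

def quadratic_expand(x):
    """Map each row of x to all monomials of degree 1 and 2 (hypothetical helper)."""
    n = x.shape[1]
    i, j = np.triu_indices(n)              # all index pairs with i <= j
    return np.hstack([x, x[:, i] * x[:, j]])

def sfa(h, n_features):
    """Linear SFA on the rows of h (time runs along axis 0).

    Returns the projection weights, the mean that was removed, and the
    mean squared difference of each extracted feature (smaller = slower).
    """
    mean = h.mean(axis=0)
    hc = h - mean
    # Whitening enforces the unit-variance and decorrelation constraints.
    cov = hc.T @ hc / len(hc)
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > 1e-10 * evals.max()      # drop numerically degenerate directions
    whiten = evecs[:, keep] / np.sqrt(evals[keep])
    z = hc @ whiten
    # Slowness: minimise the variance of the discrete-time derivative.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / len(dz)
    dvals, dvecs = np.linalg.eigh(dcov)     # eigenvalues in ascending order
    weights = whiten @ dvecs[:, :n_features]
    return weights, mean, dvals[:n_features]

# Toy input (an assumption for illustration, not the paper's data):
# a slow source sin(t) hidden nonlinearly among fast components.
t = np.linspace(0, 2 * np.pi, 2000)
fast = np.cos(11 * t)
x = np.column_stack([np.sin(t) + fast ** 2, fast])

h = quadratic_expand(x)                     # 2 linear + 3 quadratic terms
weights, mean, deltas = sfa(h, n_features=2)
y = (h - mean) @ weights                    # extracted slow features

print("correlation of slowest feature with sin(t):",
      np.corrcoef(y[:, 0], np.sin(t))[0, 1])
```

In this toy case the slow source equals x1 - x2^2, so it lies inside the space of degree-2 polynomials and the slowest feature should recover it up to sign. The same expansion applied to 100 input dimensions gives 100 + 5050 = 5150 terms, which matches the count quoted in point 21.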