Slowness and Sparseness Lead to Place, Head-Direction, and Spatial-View Cells. Franzius, Sprekeler, Wiskott. PLoS Computational Biology 2007.

  1. Deals with place cells, which encode the animal’s position
  2. Also mentions head-direction cells, grid cells, and spatial-view cells
    1. Spatial-view cells don’t encode the pose of the animal, but rather a view of the environment
  3. Head-direction cells have a Gaussian-shaped tuning curve with a width of roughly 60 to 150 degrees
  4. Place cells are generally sensitive only to location and not head direction, but the degree of independence from head direction depends on the task and environment structure
  5. Here the term oriospatial cells will be used to refer to all the types of cells mentioned in 2
  6. The stimuli that oriospatial cells operate on can be either:
    1. Idiothetic, involving motor feedback, proprioception, and vestibular input
    2. Allothetic, information about the outside world such as vision or olfaction
  7. Place cells are primarily driven by visual input, but experiments show that idiothetic information must be incorporated as well
  8. Path integration (or dead reckoning) occurs when idiothetic information is used for navigation
    1. Error when doing so accumulates over time and can only be corrected by allothetic information
    2. <Seems like this means something else than what it is in the formal planning literature.>
  9. Here they model the “self-organized formation of hippocampal place cells, head-direction cells, and spatial-view cells based on unsupervised learning on quasi-natural stimulus.”
    1. Model has no memory
    2. Takes raw visual input – must operate directly from this
  10. Because there is no memory, cannot do path integration
  11. “While such a model can certainly not be a complete model of oriospatial cells, it can show how far a memoryless purely sensory-driven system can model oriospatial cells.”
  12. The idea behind SFA is that some aspects of a visual stream, such as individual pixel values, change quickly, while the meaning of the scene changes slowly
  13. Low-pass filters can create slowly varying outputs, but they are generally not informative, and are not instantaneous
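The core computation can be sketched in a few lines. This is a minimal linear SFA toy, not the paper’s implementation (which uses nonlinear function-space expansions and a deep hierarchy):

```python
import numpy as np

def linear_sfa(x, n_components=1):
    """Minimal linear SFA: find directions whose projections vary most slowly.
    x: (T, D) signal; returns a (D, n_components) projection matrix."""
    x = x - x.mean(axis=0)
    # Whiten so that all output directions have unit variance and are decorrelated
    evals, evecs = np.linalg.eigh(x.T @ x / len(x))
    white = evecs / np.sqrt(evals)
    z = x @ white
    # The directions whose temporal differences have the smallest variance are slowest
    dz = np.diff(z, axis=0)
    _, dvecs = np.linalg.eigh(dz.T @ dz / len(dz))
    return white @ dvecs[:, :n_components]

# Toy demo: two "pixels" mixing a slow sine with a fast oscillation
t = np.linspace(0, 4 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(37 * t)
x = np.stack([slow + 0.5 * fast, slow - 0.5 * fast], axis=1)
y = (x - x.mean(axis=0)) @ linear_sfa(x)
# y[:, 0] recovers the slow component (up to sign and scale)
```

Unlike a low-pass filter, the learned function is instantaneous: once trained, each output value depends only on the current input frame.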
  14. SFA has previously been “successfully applied as a model for the self-organized formation of complex cell receptive fields in primary visual cortex.”
  15. Looks like they do a multilevel SFA here
  16. “We find that the output of the highest layer performing SFA forms a distributed oriospatial representation.  In a subsequent linear step, the model applies a mechanism for sparse coding resulting in localized oriospatial codes.”
  17. “For roughly uncorrelated head direction and body movement, the system learns head-direction cells or place cells depending on the relative speed of head rotation and body movement.  If the movement statistics is altered such that spots in the room are fixated for a while during simulated locomotion, the model learns spatial-view cell characteristics.”
  18. Attempt sparsification with the claim that it would allow for easier decoding, energy efficiency, and storage
  19. Experiments are based on what a virtual rat would see while doing Brownian motion
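The pose trajectory behind the training data can be sketched like this (parameter values here are made up for illustration, not taken from the paper):

```python
import numpy as np

def brownian_pose(n_steps, room=(1.0, 1.0), step_std=0.02, rot_std=0.3, seed=0):
    """Simulated rat pose (x, y, head angle) under Brownian motion.
    Walls reflect the position; the head angle wraps around 2*pi."""
    rng = np.random.default_rng(seed)
    size = np.array(room)
    pos = np.empty((n_steps, 2))
    phi = np.empty(n_steps)
    p = size / 2.0
    a = 0.0
    for t in range(n_steps):
        p = np.abs(p + rng.normal(0.0, step_std, 2))   # reflect at the lower walls
        p = np.where(p > size, 2.0 * size - p, p)       # reflect at the upper walls
        a = (a + rng.normal(0.0, rot_std)) % (2 * np.pi)
        pos[t], phi[t] = p, a
    return pos, phi

pos, phi = brownian_pose(5000)
```

In the model, each pose would then be rendered to an RGB view of the textured room; the rendering step is omitted here.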
  20. Three layers of SFA, with clipping and additive Gaussian noise between layers
  21. At the lowest level, each component has a receptive field of 10×10 RGB pixels; the fields partially overlap
  22. Layer sizes:
    1. 7 x 63
    2. 2 x 15
    3. 1
  23. The final output is fed into independent component analysis
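The sparsification step can be sketched with a symmetric FastICA using the standard tanh fixed-point update. This is a stand-in implementation on a toy mixture, assuming pre-whitened inputs; the paper’s exact sparse-coding choice may differ:

```python
import numpy as np

def fast_ica(z, n_iter=200, seed=0):
    """Symmetric FastICA with tanh nonlinearity on pre-whitened data z (T, D)."""
    rng = np.random.default_rng(seed)
    T, D = z.shape
    W = rng.normal(size=(D, D))
    for _ in range(n_iter):
        y = z @ W.T
        g, g_prime = np.tanh(y), 1.0 - np.tanh(y) ** 2
        W = (g.T @ z) / T - np.diag(g_prime.mean(axis=0)) @ W
        U, _, Vt = np.linalg.svd(W)     # symmetric decorrelation: (W W^T)^(-1/2) W
        W = U @ Vt
    return W

# Demo: two sparse (Laplacian) sources, linearly mixed, then whitened and unmixed
rng = np.random.default_rng(1)
s = rng.laplace(size=(2000, 2))
x = s @ np.array([[1.0, 0.4], [0.3, 1.0]])
x = x - x.mean(axis=0)
evals, evecs = np.linalg.eigh(x.T @ x / len(x))
z = x @ evecs / np.sqrt(evals)
y = z @ fast_ica(z).T   # recovered sources, up to permutation and sign
```

Applied to the distributed SFA outputs, such a rotation concentrates each unit’s response, which is what turns the distributed oriospatial code into localized fields.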
  24. Layers are trained separately from bottom to top
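The bottom-up layerwise training can be sketched on a toy signal. This is a numpy-only sketch with made-up patch sizes, clipping bounds, and noise levels; the real model operates on rendered views with nonlinear expansion in each node:

```python
import numpy as np

def sfa(x, n_out):
    """Linear SFA unit: returns (mean, W) so that y = (x - mean) @ W."""
    mu = x.mean(axis=0)
    xc = x - mu
    evals, evecs = np.linalg.eigh(xc.T @ xc / len(xc))
    white = evecs / np.sqrt(evals)
    dz = np.diff(xc @ white, axis=0)
    _, dvecs = np.linalg.eigh(dz.T @ dz / len(dz))
    return mu, white @ dvecs[:, :n_out]

rng = np.random.default_rng(0)
T = 5000
t = np.linspace(0, 4 * np.pi, T)
slow = np.sin(t)
# Four 2-pixel "patches", each mixing the shared slow signal with independent fast noise
patches = [np.stack([slow + f, slow - f], axis=1)
           for f in (rng.normal(size=T) for _ in range(4))]

# Layer 1: one SFA unit per patch, trained first; clipping and Gaussian noise follow
h = []
for p in patches:
    mu, W = sfa(p, 1)
    h.append(np.clip((p - mu) @ W, -4.0, 4.0) + rng.normal(0.0, 0.05, (T, 1)))
h = np.concatenate(h, axis=1)

# Layer 2: a single SFA unit trained on the already-fixed layer-1 outputs
mu, W = sfa(h, 1)
top = ((h - mu) @ W)[:, 0]   # should track the shared slow signal
```

The key property the sketch preserves is the greedy schedule: layer 2 only ever sees outputs of an already-trained layer 1.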
  25. Outputs can be characterized over the 3-space of position and head direction <is the output 3D?>
    1. 2D sections averaged over head direction are called spatial firing maps
    2. 1D sections averaged over position are called orientation tuning curves
    3. The pose of the (virtual) rat is denoted by s; its perception at s is x(s)
    4. The manifold of possible configurations is V.  “Note that V in general does not have the structure of a vector space.”
  26. In a complex environment, there is a 1-1 correspondence between poses and what is observed.
  27. My emphasis: “This leads to a simplified version of our problem.  Instead of using the images x(t), we use the configuration space s(t) as an input signal for our learning task.”
  28. What is in the state – I think it is 3D as described above: position and head angle
    1. <But this doesn’t capture state, as other things like velocity are also part of state>
    2. “…the optimal functions depend on the velocity statistics of the input <training> signal”
    3. Assumptions have to be made, for example that the motion of the rat is ergodic and usually responds in a similar way to the same input, etc.
  29. They talk about working in the joint distribution of pose and velocity, but then they integrate out the velocity and are left with the pose
  30. The math here goes into more detail and reminds me of SVMs with the Lagrange multipliers
  31. <These guys definitely know their math… much follows.  Then:>
  32. “The key insight of this analysis is that the optimal functions show oscillations that are spatially compressed in regions where the rat moves with low velocities.  This implies that the spatial resolution of the SFA solutions is higher in those regions.  Consequently, the size of the place fields after sparse coding should be smaller in regions with small velocities, which might explain smaller place fields  near arena boundaries.  If we assume the animal moves faster parallel to a wall of the arena than perpendicular to it, our theory predicts elongated place fields along the walls that might be similar to the crescent-shaped fields reported in … for a circular arena.”
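The flavor of these results can be reproduced in a 1D toy: for a random walk on [0, 1], theory predicts the slowest features of position are spatial harmonics cos(j·pi·x), compressed wherever the rat moves slowly (here the speed is uniform, so the first harmonic appears uncompressed). A numpy sketch, with a cubic polynomial expansion standing in for the paper’s function space:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1D random walk on [0, 1] with reflecting walls (spatially uniform speed)
n = 20000
x = np.empty(n)
p = 0.5
for i in range(n):
    p = abs(p + rng.normal(0.0, 0.02))   # reflect at the lower wall
    p = 2.0 - p if p > 1.0 else p        # reflect at the upper wall
    x[i] = p

# Cubic polynomial expansion followed by linear SFA (whiten, then pick the
# direction whose temporal differences have the smallest variance)
F = np.stack([x, x ** 2, x ** 3], axis=1)
F = F - F.mean(axis=0)
evals, evecs = np.linalg.eigh(F.T @ F / n)
Z = F @ evecs / np.sqrt(evals)
dZ = np.diff(Z, axis=0)
_, dvecs = np.linalg.eigh(dZ.T @ dZ / len(dZ))
slowest = Z @ dvecs[:, 0]

# The slowest non-constant feature approximates the first spatial harmonic cos(pi * x)
corr = abs(np.corrcoef(slowest, np.cos(np.pi * x))[0, 1])
```

A spatially varying step size in the walk would compress these oscillations in the slow regions, which is the effect behind the smaller place fields near boundaries.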
  33. Theoretical results show that for high rotational velocity, the system output is slowest if it is invariant to head direction and encodes only spatial position.  In contrast, for low rotational velocity, invariance to position while coding for head direction is best
    1. This is borne out in the (simulated) experiment
  34. The first run of the experiment, however, was oversimplified, as it assumed the other parts of the system already exist when in reality they must be developed simultaneously.  Additionally, simplifications were made to the movement model that made learning simpler
  35. In another experiment fixation is held at a particular point while movement occurs
  36. <Mostly blowing through this>
  37. <I am very confused>: ”Although most of the parameters in our model (i.e., all the weights in the SFA and ICA steps) are learned in an unsupervised manner, a number of parameters were chosen by hand. These parameters include the input picture size, receptive field sizes, receptive field positions, and overlaps in all layers, the room shape, and textures, the expansion function space, number of layers, choice of sparsification algorithm, movement pattern, FOV, and number of training steps. We cannot explore the entire parameter space here and show instead that the model performance is very robust with respect to most of these parameters.”

  38. “The generated representations were coding specifically for some information (e.g., position) and were invariant to the others (e.g., head direction).”
  39. Whereas grid cells form a hexagonal grid, here the distribution depends on the shape of the environment
  40. The strong influence of room shape on the SFA results is due to the temporally global decorrelation and unit variance constraints in SFA. Thus, SFA requires a decorrelation of activities over arbitrarily long timescales, which might be difficult to achieve in a biologically plausible manner
  41. “Although the model is capable of learning place cells and head-direction cells, if it learns on distinct adequate movement statistics, a model rat should obviously not have to traverse its environment once with low relative rotational speed to learn head-direction cells and once more with high relative rotational speed to learn place cells.”
  42. “Nevertheless, if the real movement statistics contains very few episodes of relatively quick transition at all, the mechanism fails and head-direction cells cannot become position invariant.”
  43. Although the eigenvalue decomposition can’t be considered biologically plausible, a similar algorithm can function in a gradient-descent mode, which is more plausible.  Similarly, ICA also does not seem plausible, but there are other methods that may be more plausible and serve a similar function
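A sketch of what such a gradient-based variant could look like: an illustrative projected-gradient toy that replaces the slow-direction eigendecomposition, not a published biologically plausible rule:

```python
import numpy as np

# Whitened toy input: a slow sine and a fast oscillation, mixed across two channels
t = np.linspace(0, 4 * np.pi, 2000)
x = np.stack([np.sin(t) + 0.5 * np.sin(41 * t),
              np.sin(t) - 0.5 * np.sin(41 * t)], axis=1)
x = x - x.mean(axis=0)
evals, evecs = np.linalg.eigh(x.T @ x / len(x))
z = x @ evecs / np.sqrt(evals)
dz = np.diff(z, axis=0)

# Instead of an eigenvalue decomposition, minimize the slowness objective
# E[(dy/dt)^2] for y = z @ w by projected gradient descent on the unit sphere
rng = np.random.default_rng(0)
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(200):
    grad = 2.0 * dz.T @ (dz @ w) / len(dz)
    w = w - 1.0 * grad
    w /= np.linalg.norm(w)   # renormalization keeps the output at unit variance

y = z @ w   # converges to the slowest direction, recovering the slow sine
```

An online version would replace the batch whitening and gradient with running estimates; only the batch geometry of the iteration is shown here.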
  44. <Related and future work sections>
  45. “To our knowledge, no prior model allows the learning of place cells, head-direction cells, and spatial-view cells with the same learning rule. Furthermore, there are only a few models that allow clear theoretical predictions, learn oriospatial cells from (quasi) natural stimuli, and are based on a learning rule that is also known to model early visual processing well.”
  46. <Future work>
  47. Aside from movement of the rat, the environment is otherwise static – how would these methods work in a changing environment?
    1. Other changing features of the environment (such as shifting lighting due to movement of the sun) would be even slower features that may be picked up by SFA.  This is probably not good, but there are some biologically-based predictions that may be borne out by an experiment in this case.
