Emergence of Simple-cell Receptive Field Properties by Learning a Sparse Code for Natural Images. Olshausen, Field. Letters to Nature 1996.

  1. Considers responses of simple cells in mammal visual cortex as a function of natural images
  2. Earlier approaches attempted to build feature detectors like those in visual cortex, but all have failed to satisfy all the properties simple cells exhibit
  3. Here they propose a method based on sparseness and claim it can satisfy all properties <seems like this is also shown empirically>
  4. Looking to find a set of basis functions that describes natural images, with coefficient values on each basis being “… as statistically independent as possible…”
  5. PCA is good for data that is Gaussian and has other properties, but there is good reason to believe natural images do not exhibit these properties
  6. “We conjecture that natural images have ‘sparse structure’–that is, any given image can be represented in terms of a small number of descriptors out of a large set–and so we shall seek a specific form of low-entropy code in which the probability distribution of each coefficient’s activity is unimodal and peaked around zero.”
    1. It is an optimization problem where there is tradeoff between information loss and sparseness of description
  7. Also, cost function encourages weights to be spread evenly across features
  8. Do gradient descent to optimize basis functions <not exactly clear how that is done, although I am not reading so carefully>
  9. Basis functions may be overcomplete
  10. Results show “The vast majority of basis functions are well localized within each array (with the exception of the low-frequency functions).  Moreover, the functions are oriented and selective to different spatial scales.  This result should not come as a surprise, because it simply reflects the fact that natural images contain localized, oriented structures with limited phase alignment across spatial frequency.”
  11. Features learned are similar to wavelet codes <not sure what that means>
  12. “The results demonstrate that localized, oriented, bandpass receptive fields emerge when only two global objectives are placed on a linear coding of natural images: that information can be preserved, and that the representation be sparse.”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: