Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. Le, Zou, Yeung, Ng. CVPR 2011.

  1. Instead of using hand-coded features like SIFT for processing static images in a video, here unsupervised learning is used to generate feature detectors.
  2. Uses an extension of Independent Subspace Analysis (ISA) <I don’t know what that is; ah, they say it’s an extension of independent component analysis (ICA)>
    1. Used in conjunction with deep learning methods like stacking and convolution to generate hierarchical representations
  3. Method beats previously published results on a number of datasets
  4. Previous results show that ISA can generate receptive fields similar to those in V1 and MT
  5. As opposed to ICA, ISA “… learns features that are robust to local translation while being selective to frequency, rotation, and velocity.”
  6. On the other hand, ISA/ICA scale poorly to high-dimensional data, so the method is modified here to work well in high dimensions by borrowing two ideas from convnets: convolution and stacking
  7. In comparison to previous state of the art, all steps of processing are the same aside from the first level of video processing, replacing hand-designed features with learned ones.
    1. <Since I am exactly interested in the latter parts of processing, I’m going to leave this paper now.>
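
To make the ISA idea in items 2 and 5 concrete, here is a minimal sketch of the pooling step as I understand it: first-layer linear filter responses are squared, summed within a fixed group (a "subspace"), and square-rooted. The function name `isa_responses`, the non-overlapping grouping, and the hand-picked weights are my own assumptions for illustration. In the paper the filters are learned from video patches, which this sketch does not do.

```python
import math


def isa_responses(x, W, subspace_size):
    """Pooled ISA-style responses for one input vector.

    x: input vector (list of floats).
    W: first-layer filters, one row per filter (hand-picked here, not learned).
    subspace_size: number of consecutive filters pooled into one output
        (assumption: fixed, non-overlapping groups).
    """
    if len(W) % subspace_size != 0:
        raise ValueError("number of filters must be a multiple of subspace_size")

    # First layer: plain linear filter responses, as in ICA.
    linear = [sum(w_j * x_j for w_j, x_j in zip(row, x)) for row in W]

    # Second layer: square root of the summed squared responses per subspace.
    pooled = []
    for start in range(0, len(linear), subspace_size):
        group = linear[start:start + subspace_size]
        pooled.append(math.sqrt(sum(r * r for r in group)))
    return pooled


# Two filters forming one subspace: the pooled output depends only on the
# energy inside the subspace, not on how it is split between the filters.
W = [[1.0, 0.0], [0.0, 1.0]]
print(isa_responses([3.0, 4.0], W, 2))    # [5.0]
print(isa_responses([4.0, 3.0], W, 2))    # [5.0]
print(isa_responses([-3.0, -4.0], W, 2))  # [5.0]
```

The three inputs give the same pooled response even though the individual filter outputs differ. That is the kind of invariance the pooling buys, while each subspace can still be selective for whatever its filters jointly respond to.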
