Slow Feature Analysis for Human Action Recognition. Zhang, Tao. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. 2012

  1. Considers 4 forms of SFA for action recognition (apparently the other 3 are new for this work):
    1. (original) unsupervised SFA
    2. supervised SFA
    3. discriminative SFA
    4. spatial discriminative SFA
  2. Train SVM on slow features of squared input <not exactly, see ASD in #6>
  3. Test on a number of databases with good results
  4. “First, a large number of local cuboids are collected by randomly sampling in motion boundaries. Then, a number of slow feature functions are learned from these local cuboids.” <I don’t know what a cuboid is>
  5. “To the best of our knowledge, this is the first work that uses the slowness principle or SFA to analyze human motions.”
  6. “Instead of using the responses of the learned slow feature functions directly, we propose the Accumulated Squared Derivative (ASD) feature to represent a given action sequence which is a statistical representation of the slow features in an action sequence.”
  7. 3 common classes of action recognition algorithms:
    1. Holistic features: global properties like body shape, joint angles, motion.  Require good segmentation and tracking; sensitive to noise and tracking errors
    2. Local descriptors: Find points of interest, bag of words, some other stuff
    3. Biologically inspired: Motion sensitivity inspired by visual cortex.  SFA is also somewhat biologically motivated but has a very different approach
  8. “There are four main steps in the SFA-based human action recognition, including Collection of training cuboids, Slow feature function learning, Action feature representation, and Classification. In Slow feature function learning, we extend the original SFA by using weakly supervised information and spatial information of the training cuboids to obtain discriminative slow feature functions for action classification.”
  9. Finding cuboids basically means finding what part of the image is of interest (contains a person in the foreground).  They have some presupplied information that comes with the data set (a bounding box; the cuboids are closer to the actual shape of the foreground figure, as composed of many overlapping squares/cubes), which makes this easier.
  10. They use a quadratic basis, but since that is too big, they then use PCA to project down to 50 dimensions, plus some other preprocessing
  11. In supervised SFA, instead of running all data through the same algorithm, SFA is run independently on the data for each action of interest. “Finally, the statistical feature is computed with all slow feature functions. However, different actions may share many similar local motion patterns, so different labels to these “common” cuboids are misleading.” <not sure what this means exactly>
  12. “D [discriminative]-SFA is inspired by discriminative sparse coding [45], where a number of sets of discriminative dictionaries are learned, and each set of dictionaries is used to reconstruct a specific image class. Accordingly, D-SFA learns a number of sets of functions and each set of functions is used to slowdown a specific action class.”
    1.  “Therefore, each learned function makes the intraclass signals x_c(t) vary slowly, but makes the interclass signals x_{c’}(t) that are different from class c vary quickly.”
    2. Results in a similar set of constraints as standard SFA, but x values are now c [class]-indexed
  13. Spatial discriminative SFA adds spatial information.  The motivation is that some activities might involve more motion in a certain region (for example, upper could correspond to arm motion, while lower to leg motion)
    1. Here they do upper, mid, and lower
  14. For standard SFA “Each cuboid is considered as one minisequence. The covariance matrix and the time-derivative covariance matrix are calculated by combining all minisequences.”
  15. Do L1 normalization on the cuboids because the number of them may vary but the vectors representing the data must be the same length
  16. Bottom of p 441 has more in-depth description of algorithm run end-to-end
  17. The ASD is the concatenation of multiple slow feature functions.
  18. Classification is done via SVM on ASD
  19. More technical details in experimental results section
  20. <low on energy not taking much further notes>
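
A minimal sketch of how I read the ASD feature from items 6, 15, and 17, not the authors' code. It assumes the responses of the slow feature functions have already been computed for every cuboid; the function name `asd_feature` and the nested-list layout `responses[i][t][j]` are my own choices for illustration. Squared frame-to-frame differences are summed over time and over cuboids, then L1-normalized so that sequences with different numbers of cuboids stay comparable.

```python
def asd_feature(responses):
    """Accumulated squared derivative over all cuboids of one action sequence.

    responses[i][t][j] is assumed to hold the output of slow feature
    function j at frame t of cuboid i (a hypothetical layout).
    Returns one value per slow feature function, L1-normalized.
    """
    num_functions = len(responses[0][0])
    acc = [0.0] * num_functions
    for cuboid in responses:
        # Pair each frame with its successor to get the temporal difference.
        for prev, curr in zip(cuboid, cuboid[1:]):
            for j in range(num_functions):
                diff = curr[j] - prev[j]
                acc[j] += diff * diff
    total = sum(acc)
    # L1 normalization; an all-zero vector is returned unchanged.
    return [a / total for a in acc] if total > 0 else acc


# Toy input: 2 cuboids, 3 frames each, 2 slow feature functions.
toy = [
    [[0.0, 0.0], [1.0, 0.0], [2.0, 1.0]],
    [[0.0, 1.0], [0.0, 3.0], [1.0, 3.0]],
]
feature = asd_feature(toy)  # feature == [0.375, 0.625]
```

For the supervised and discriminative variants, where one set of functions is learned per action class, the per-class vectors would presumably be concatenated before going to the SVM.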

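Items 10 and 14 can be sketched roughly as follows, assuming NumPy is available. This is plain unsupervised SFA with a quadratic expansion, written from the standard formulation rather than from the paper's code. The helper names (`quadratic_expand`, `fit_sfa`, `apply_sfa`) and the toy data are hypothetical, and the PCA reduction to 50 dimensions is left out because the toy input has only 3 dimensions. The point is that each cuboid is its own minisequence: the two covariance matrices are pooled over all of them, but time differences are never taken across two cuboids.

```python
import numpy as np


def quadratic_expand(x):
    """Append all degree-2 monomials x_i * x_j (i <= j) to each row of x."""
    i, j = np.triu_indices(x.shape[1])
    return np.hstack([x, x[:, i] * x[:, j]])


def fit_sfa(minisequences, n_functions):
    """Learn slow feature functions from a list of (T_i, d) arrays."""
    expanded = [quadratic_expand(m) for m in minisequences]
    mean = np.vstack(expanded).mean(axis=0)
    centered = [z - mean for z in expanded]
    stacked = np.vstack(centered)
    # Differences are computed inside each minisequence, then pooled.
    derivs = np.vstack([np.diff(z, axis=0) for z in centered])
    cov = stacked.T @ stacked / len(stacked)
    dcov = derivs.T @ derivs / len(derivs)

    # Whiten with the eigenvectors of cov, dropping near-singular directions.
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > 1e-10 * evals.max()
    whiten = evecs[:, keep] / np.sqrt(evals[keep])

    # In whitened space the slowest directions are the eigenvectors of the
    # derivative covariance with the smallest eigenvalues (eigh sorts ascending).
    slowness, rot = np.linalg.eigh(whiten.T @ dcov @ whiten)
    weights = whiten @ rot[:, :n_functions]
    return mean, weights, slowness[:n_functions]


def apply_sfa(x, mean, weights):
    """Evaluate the learned functions on the frames (rows) of x."""
    return (quadratic_expand(x) - mean) @ weights


# Toy data: 20 minisequences of 30 frames, one slow channel and two noise channels.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 30)
minisequences = [
    np.column_stack([np.sin(t + rng.uniform(0, 2 * np.pi)),
                     rng.normal(size=t.size),
                     rng.normal(size=t.size)])
    for _ in range(20)
]
mean, weights, slowness = fit_sfa(minisequences, n_functions=2)
```

On the training data the outputs have zero mean, unit variance, and are decorrelated, which are the usual SFA constraints. The supervised variant in item 11 would, as far as I understand it, call something like `fit_sfa` once per action class on that class's cuboids only.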