A Survey on Vision-based Human Action Recognition. Poppe. Image and Vision Computing 2010.

  1. Consider the task of annotating actions in video
  2. The task combines feature extraction <motion trackers at least solve this part> followed by classification
  3. One taxonomy for the task breaks actions down between:
    1. Action primitives: low-level movements (such as of a single limb). ex/move left leg
    2. Actions: built of primitives, may be composed of contemporaneous primitives (such as arm and leg movement), may be cyclic. ex/run
    3. Activities: a task that can be given a label. ex/jumping hurdles
    4. <These examples are from the text, and are perhaps not the best ones, because surely running is sometimes the entire activity?>
  4. The survey does not consider environmental context, multi-person interaction, or objects
    1. It considers only full-body movement, not gesture recognition
  5. In the field of gait recognition, the goal is to differentiate people based on the way they walk; here the goal is the opposite: to look at multiple people performing tasks and determine which of them are performing the same task, regardless of who is performing it
    1. More recent work, though, tries to do both: identify the activity and the individuals performing it
  6. Pose reconstruction is akin to a regression problem, whereas activity recognition is a classification problem
  7. <Didn’t get to finish; posting for spring cleaning>
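
The two-stage view in item 2 (feature extraction, then classification) can be sketched roughly as below. This is only a toy illustration, not a method from the survey: the "motion energy" feature, the nearest-centroid classifier, the function names (`extract_features`, `fit_centroids`, `classify`), and the sample clips are all assumptions made up for the example. A real system would use far richer features.

```python
from math import dist
from statistics import mean


def extract_features(video):
    """Stage 1 (toy): summarize a clip by its frame-to-frame motion energy.

    `video` is a list of frames, each a flat list of pixel intensities.
    Assumes at least two frames. Returns (mean energy, peak energy).
    """
    energies = [
        mean(abs(a - b) for a, b in zip(prev, cur))
        for prev, cur in zip(video, video[1:])
    ]
    return (mean(energies), max(energies))


def fit_centroids(labelled_videos):
    """Stage 2, training: average the feature vectors of each action label."""
    features_by_label = {}
    for video, label in labelled_videos:
        features_by_label.setdefault(label, []).append(extract_features(video))
    return {
        label: tuple(mean(dim) for dim in zip(*feats))
        for label, feats in features_by_label.items()
    }


def classify(video, centroids):
    """Stage 2, prediction: pick the label whose centroid is nearest."""
    features = extract_features(video)
    return min(centroids, key=lambda label: dist(features, centroids[label]))


# Hypothetical three-pixel clips: small changes for "walk", large for "run".
walk_clip = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.1, 0.1, 0.0], [0.1, 0.1, 0.1]]
run_clip = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]

centroids = fit_centroids([(walk_clip, "walk"), (run_clip, "run")])
unseen_clip = [[0.0, 0.0, 0.0], [0.9, 0.8, 1.0], [0.0, 0.0, 0.0]]
print(classify(unseen_clip, centroids))
```

The output is a discrete label, which is what makes this a classification problem in the sense of item 6; a pose-reconstruction system would instead return continuous values such as joint positions.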

