- Automatic derivation of motion primitives and means of combining them from human motion data using isomap
- For robot control, it's useful to have a restricted set of primitives to search over when generating motion, instead of planning in the full configuration space, which is huge. Sometimes this set can be defined by hand, but that has obvious limitations (time-intensive, requires domain expertise, and even then may be incorrect)
- Paper extends isomap to extract spatio-temporal structure
- Finding motion primitives relies on additional clustering and interpolation
- Here they use the method for motion synthesis
- “In Motion Textures, <another approach> motion segments are learned so they can be accurately reproduced by a linear dynamical system within some error threshold.”
- Here the goal is “… to produce primitive behaviors that have an observable theme or meaning.”
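- For reference, the Motion Textures criterion amounts to fitting a linear dynamical system to each segment and checking reconstruction error. A minimal sketch of that idea (my own illustration with an assumed least-squares fit, not that paper's actual procedure):

```python
import numpy as np

def fit_lds(segment):
    """Fit x_{t+1} ~ A @ x_t to one motion segment (frames x DOFs) by least squares
    and report the average reconstruction error. Illustration of the Motion Textures
    criterion only; the actual paper's fitting procedure may differ."""
    X_now, X_next = segment[:-1].T, segment[1:].T   # shape (n_dof, T-1) each
    A = X_next @ np.linalg.pinv(X_now)              # least-squares dynamics matrix
    error = np.linalg.norm(X_next - A @ X_now) / X_next.size
    return A, error
```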
- Method here requires some domain knowledge
- A number of references, but this work and the motion texture seem the most relevant
- “Each extracted feature represents a group of motions with the same underlying theme and can then be used to construct a primitive motion module realizing the theme.”
- Straight PCA has been used for this, but the features that come out of it aren't really useful
- Can also do PCA and then clustering on top of that, but unless the # of clusters is known a priori, the results also come out pretty meaningless. Also, the clustering only gives meaningful results on data that falls within the corpus used; it doesn't extrapolate well
- Vanilla Isomap, locally linear embedding, and kernel PCA all have the same qualitative drawbacks as vanilla PCA
- Here they do Isomap, but add temporal component
- “The Isomap embedding unravels the spatial structure of the S-curve removing the ‘s’ nonlinearity, producing the flattened data indicative of the 2-manifold structure of the S-curve. However, the model that has generated the data are a 1-manifold with an S-curve nonlinearity and multiple translated instances. Spatio-temporal Isomap produces an embedding indicative of this 1-manifold structure. This embedding both unravels the S-curve nonlinearity and collapses corresponding points from multiple instances of the S-curve to a single point.”
- The idea behind isomap (as well as kernel PCA) is to do eigen decomposition on a similarity matrix in feature space as opposed to a covariance matrix (which is used in PCA)
- “The feature space is a higher dimensional space in which a linear operation can be performed that corresponds to a nonlinear operation in the input space. The caveat is that we cannot transform the data directly to feature space. For performing PCA in a feature space, however, we only require the dot-product (or similarity) between every pair of data points in feature space. By replacing the covariance matrix C with the similarity matrix D, we fit an ellipsoid to our data in feature space that produce nonlinear PCs in the input space.”
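- To make the similarity-matrix view concrete, here is a minimal sketch (my own, not the paper's code) of plain Isomap: build a neighborhood graph, approximate geodesic distances, then eigendecompose the resulting similarity matrix as in classical MDS / kernel PCA:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph

def isomap_embed(X, n_neighbors=8, n_components=2):
    """Minimal Isomap: geodesic distances + eigendecomposition of a similarity matrix."""
    # k-nearest-neighbor graph weighted by Euclidean distance
    knn = kneighbors_graph(X, n_neighbors, mode="distance")
    # approximate geodesic distances as shortest paths through the graph
    # (assumes the graph is connected; otherwise some distances come back infinite)
    D = shortest_path(knn, method="D", directed=False)
    # classical MDS: double-center the squared distances into a Gram/similarity matrix ...
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # ... and eigendecompose that similarity matrix instead of a covariance matrix
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:n_components]
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
```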
- Spatio-temporal isomap works the same way, but an extra step is added to deal with temporal dependencies. As compared to vanilla isomap “If the data have temporal dependency (i.e., a sequential ordering), spatial similarity alone will not accurately reflect the actual structure of the data.”
- They tried two methods of dealing with temporal dependencies
- Spatial neighbors are replaced by adjacent temporal neighbors, which introduces a first-order Markov dependency into the embedding
- Creates connected temporal neighbors (CTNs), as well as CTN connected components
- “Points not in this CTN connected component will be relatively distal. Thus, CTN connected components will be separable in the embedding through simple clustering.”
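- A sketch of how the temporal step could look, assuming the adjustment is simply shrinking pairwise distances between temporally adjacent samples before the geodesic/eigendecomposition steps above (the paper's exact CTN rule may differ):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def spatiotemporal_distances(X, c_tn=10.0):
    """Shrink distances between samples that are adjacent in time so the embedding
    reflects temporal as well as spatial structure. Assumed simplification of the
    spatio-temporal adjustment, not necessarily the paper's exact rule."""
    D = squareform(pdist(X))                 # plain spatial distances
    for t in range(len(X) - 1):
        # first-order Markov dependency: frame t and frame t+1 are temporal neighbors
        D[t, t + 1] /= c_tn
        D[t + 1, t] /= c_tn
    return D
```

- The neighborhood-graph and eigendecomposition steps then run on this adjusted matrix instead of the raw Euclidean distances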
- Processing kinematic motion:
- 1st iteration makes clusterable groups of motion, called primitive feature groups
- Interpolation is done on primitive feature groups to make parameterized primitive motion modules
- Spatio-temporal Isomap is then applied to the first embedding to make more clusterable groups of motion, called behavior feature groups
- From a behavior feature group, a meta-level behavior encapsulates the component primitives and links them with transition probabilities (sketched below)
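- The transition-probability part is easy to picture: after clustering, consecutive motion segments give a sequence of primitive labels, and first-order transition counts can be normalized into probabilities. A small sketch (my illustration; the paper's meta-level construction may differ in detail):

```python
import numpy as np

def transition_probabilities(label_sequence, n_primitives):
    """Estimate first-order transition probabilities between primitive labels
    observed in a sequence of consecutive motion segments."""
    counts = np.zeros((n_primitives, n_primitives))
    for a, b in zip(label_sequence[:-1], label_sequence[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0            # avoid dividing by zero for unused labels
    return counts / row_sums

# e.g. cluster labels of six consecutive segments
probs = transition_probabilities([0, 1, 1, 2, 0, 1], n_primitives=3)
```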
- Feels like it's doing things pretty similar to slow feature analysis; “Incremental Slow Feature Analysis: Adaptive Low-Complexity Slow Feature Updating from High-Dimensional Input Streams” does mention this work as similar but a little different
- First step requires motion segmenting. They use both ground-truth segmentation as well as an algorithm called kinematic centroid segmentation
- SFA doesn’t require this, the segmenting falls out by itself
- <I think this is a bit of a deal breaker for us>
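- For intuition, one simple heuristic in the spirit of centroid-based segmentation (an assumption on my part, not necessarily the paper's kinematic centroid algorithm): track the centroid of a limb's joint positions and cut a segment where its excursion from the segment's start peaks:

```python
import numpy as np

def centroid_segmentation(joint_positions, min_len=10):
    """Rough sketch of a centroid-based segmentation heuristic. joint_positions has
    shape (T, n_joints, 3); returns frame indices of segment boundaries.
    This is an illustrative guess, not the paper's exact kinematic centroid rule."""
    centroids = joint_positions.mean(axis=1)           # (T, 3) centroid trajectory
    boundaries, start = [0], 0
    T = len(centroids)
    t = start + min_len
    while t < T - 1:
        d_prev = np.linalg.norm(centroids[t - 1] - centroids[start])
        d_curr = np.linalg.norm(centroids[t] - centroids[start])
        d_next = np.linalg.norm(centroids[t + 1] - centroids[start])
        if d_curr >= d_prev and d_curr > d_next:       # local max of excursion from start
            boundaries.append(t)
            start = t
            t = start + min_len
        else:
            t += 1
    boundaries.append(T - 1)
    return boundaries
```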
- Seems like they use a pretty simple clustering method <they call it “sweep and prune”, never heard of it>
- The set of motion segments in each feature group is treated as a set of exemplars. Variations can then be constructed by interpolating between exemplars (there are many different ways to do the interpolation); a sketch follows below
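- The simplest version of that interpolation is easy to sketch: time-normalize two exemplar segments and blend them frame-by-frame (the paper discusses several interpolation schemes; this is just the linear one, with assumed array shapes):

```python
import numpy as np

def interpolate_exemplars(seg_a, seg_b, w, n_frames=50):
    """Blend two exemplar motion segments (arrays of shape (T, n_dof) of joint angles)
    after resampling both to a common length. w in [0, 1]: 0 -> seg_a, 1 -> seg_b."""
    def resample(seg):
        t_old = np.linspace(0.0, 1.0, len(seg))
        t_new = np.linspace(0.0, 1.0, n_frames)
        return np.stack([np.interp(t_new, t_old, seg[:, d]) for d in range(seg.shape[1])],
                        axis=1)
    return (1.0 - w) * resample(seg_a) + w * resample(seg_b)
```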
- Processing sequence does:
- primitive features -> behavior features -> meta-level behaviors
- <skimming the rest because the segmentation requirement for initial processing probably makes this unusable for us>