Regularized Sparse Kernel Slow Feature Analysis. Böhmer, Grünewälder, Nickisch, Obermayer. ECML PKDD 2011 Talk.


SFA must still be pretty esoteric, because at the beginning of the talk (at ECML PKDD 2011) the speaker asked whether anyone knew anything about SFA, and seemingly nobody raised a hand.

http://videolectures.net/ecmlpkdd2011_boehmer_regularized/

  1. SFA is unsupervised learning for time series
  2. Related to graph spectral analysis
  3. About how to apply linear algorithms to nonlinear data sets
  4. Application here for spoken word identification?
  5. Idea is to map the original stimulus to some other feature space.
  6. Need to extract a functional basis in terms of latent variables
  7. Want a low dimensional feature space embedding
  8. Often this is done by hand, but it's not always easy
  9. For unsupervised learning you could instead try PCA, but PCA only reconstructs the original input and does not recover the latent variables
  10. SFA minimizes the l2 norm of the features' temporal derivative, subject to unit-variance and decorrelation constraints (a minimal sketch of the linear case appears after this list)
  11. Given an infinite time series and an unrestricted function class, the features that emerge from SFA form a Fourier basis in the space of the latent variables, with the first features encoding the slowest-changing directions and the higher ones encoding increasingly rapid changes
    1. This assumption (an unrestricted function class) is quite strong
  12. Here the algorithm is formulated with kernels
  13. Kernel SFA costs O(n^3) in the number of training samples n
  14. But the kernel-SFA approach exhibits overfitting and numerical instabilities
  15. Regularization can be added to fix these problems in KSFA, but it introduces extra parameters, and the regularization must be tuned for each kernel
  16. But there's another way to regularize: just subsample the training data used in the kernel expansion <not sure how this does what it's supposed to>; see the sparse sketch after this list
  17. This method is more efficient than the other regularization approach and produces better results
  18. Now moving on to spoken vowel recognition (although he said these results are for an upcoming journal paper)
  19. Outperforms working in the raw input space with fewer than 10 features
  20. Outperforms kernel PCA
  21. “Take home message”:
    1. Context: Linear classification/regression wrt latent variables
    2. Data: Complex time series with reasonable kernel
    3. Problem: No idea how to construct proper feature space
    4. Suggestion: Try RSK-SFA to approximate a Fourier basis over the latent variables
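
For concreteness, here is a minimal sketch of the linear-SFA optimization from items 10 and 11, posed as a generalized eigenvalue problem in NumPy. This is my own illustration under the standard SFA assumptions (zero-mean, unit-variance, decorrelated features), not the authors' code; all names are made up.

```python
import numpy as np
from scipy.linalg import eigh

def linear_sfa(X, n_features):
    """X: (T, d) array, one row per time step; returns the slowest projections."""
    X = X - X.mean(axis=0)              # zero-mean constraint
    Xdot = np.diff(X, axis=0)           # discrete temporal derivative
    C = X.T @ X / len(X)                # signal covariance (unit-variance constraint)
    Cdot = Xdot.T @ Xdot / len(Xdot)    # derivative covariance (slowness objective)
    # Generalized eigenproblem Cdot w = lambda * C w; the smallest
    # eigenvalues correspond to the slowest features.
    slowness, W = eigh(Cdot, C)
    return W[:, :n_features], slowness[:n_features]

# Toy usage: recover a slow sine that is mixed with a fast one.
t = np.linspace(0, 10 * np.pi, 2000)
X = np.c_[np.sin(0.1 * t) + 0.1 * np.sin(5 * t), np.sin(5 * t)]
W, slowness = linear_sfa(X, 1)          # first column extracts the slow component
```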
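
And a rough sketch of the sparse kernel variant from items 12 through 17, as I understood it: expand over a subset of m support points instead of all n samples, which shrinks the O(n^3) eigenproblem to roughly O(n m^2), and add a small ridge term for numerical stability. The RBF kernel, the random subset selection, and the ridge-on-covariance regularizer here are my own assumptions; the paper's actual regularizer and sparsification rule differ in detail.

```python
import numpy as np
from scipy.linalg import eigh

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared distances, then the Gaussian kernel (my kernel choice).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sparse_kernel_sfa(X, n_features, m=100, gamma=1.0, ridge=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
    support = X[idx]                     # sparse support set (random here; the paper is smarter)
    K = rbf_kernel(X, support, gamma)    # (n, m) kernel expansion of the whole series
    K = K - K.mean(axis=0)               # center the expanded features
    Kdot = np.diff(K, axis=0)            # temporal derivative in feature space
    C = K.T @ K / len(K) + ridge * np.eye(K.shape[1])  # ridge term for stability (assumption)
    Cdot = Kdot.T @ Kdot / len(Kdot)
    slowness, A = eigh(Cdot, C)          # smallest eigenvalues = slowest features
    return support, A[:, :n_features]
```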