Finite-Sample Analysis of Lasso-TD. Ghavamzadeh, Lazaric, Munos, Hoffman. ICML 2011.

Brief notes from a very light read – look at the authors and you know that you can either read this paper in 5 minutes, or 5 weeks

  1. Theoretical results encompass both LARS-TD and LC-TD.  They call their method Lasso-TD.
  2. Methods form a contraction mapping, so they converge
  3. A Lasso-TD solution always exists (although it may not be unique) and has a unique value function
  4. Performance bounds are derived “estimation error with the rate O(||α||1 n -1/4).”  Additional assumptions improve this to O(||α||0 n -1/2)
  5. Theorem connects “… prediction performance of Lasso-TD (which is an L1-regularized fixed point problem) and the L1-norm of the best approximation of V in Fn.”
  6. As number of features d increases wrt number of samples n, performance of vanilla LSTD gets continually worse, while Lasso-TD only degrades at (~O(log d))
  7. “… in order for Lasso-TD to be effective, a sparse approximation of the value function must exist.”
    1. They go on to talk about some properties of an MDP that would cause this property to exist
  8. Domains with smooth rewards and dynamics will have smooth value functions.
    1. “From approximation theory, we know that piecewise-smooth functions can be accurately approximated by localized features, such as wavelets (…).  As a result, if we define F as the linear function space spanned by a set of d wavelets, any value function is likely to be well-approximated by functions in F with a relatively small number of non-zero coefficients.”
  9. Also mentions LSTD-RP (LSTD with random projections <on my reading list>).
    1. In the general unrestricted case, LSTD-RP works better than Lasso-TD.
    2. When sparsity does exist, however, Lasso-TD works better

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: