Animating the Dead: Computational Necromancy with Reinforcement Learning. Bill Smart, et al. 2008


Notes from his talk at MSR.

  1. Do manifold learning
  2. Give up guarantees of global optimality
  3. Results on the swimmer are “easy” for the algorithm because it's easy to find the manifold
    1. Domains with cyclic actions tend to lie on a ring-like manifold (a high-dimensional cylinder) that connects back on itself when the gait loops back to a certain point
  4. Manifolds work best in domains with constraints; the approach gives up performance on arbitrary MDPs
  5. Works with a reward function, not just path planning
  6. Control works by fitting a locally quadratic value function and planning a path through state space that gives good reward
  7. Robustness of the controller depends on the smoothness of the domain
  8. Needs to be “bootstrapped” to reasonably good policies to learn in some domains:
    1. In swimmer domain things are smooth enough that it can start out doing random stuff and improve on that
    2. In the walker domain it has to start with a policy that can at least stand upright
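The local-control step in item 6 can be sketched in a few lines: sample states near the current state, fit a quadratic model of the value function by least squares, and steer toward the fitted model's maximizer. This is a minimal illustration only, not the talk's actual algorithm; the state space, the "true" value function, and the sampling radius below are all made-up assumptions.

```python
import numpy as np

# Hypothetical 2-D state space with a smooth true value function
# V(s) = -||s - s*||^2; the controller only sees sampled values near s0.
rng = np.random.default_rng(0)
s_star = np.array([1.0, -0.5])                 # unknown optimum (assumed)
true_V = lambda s: -np.sum((s - s_star) ** 2, axis=-1)

s0 = np.zeros(2)                               # current state
samples = s0 + 0.2 * rng.standard_normal((200, 2))
values = true_V(samples)

# Fit a locally quadratic model V(s) ~ s^T A s + b^T s + c by least squares.
x, y = samples[:, 0], samples[:, 1]
features = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(features, values, rcond=None)
a11, a12, a22, b1, b2, c = coef

# Maximize the fitted quadratic: solve grad V = 2 A s + b = 0.
A = np.array([[a11, a12 / 2], [a12 / 2, a22]])
b = np.array([b1, b2])
s_next = np.linalg.solve(-2 * A, b)
print(np.round(s_next, 2))                     # recovers s* here
```

Because the toy value function is exactly quadratic, the fit recovers the optimum in one step; in the talk's setting the model is only locally valid, so the controller would move a short distance toward `s_next` and refit.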