Ideas from Fitted Q-iteration in continuous action-space MDPs: Andras Antos, Csaba Szepesvari, Remi Munos


  1. The paper outlines a modified fitted Q-iteration algorithm for stochastic MDPs with continuous state and action spaces
  2. Requires a set of candidate policies to select from
  3. Requires some smoothness assumptions, such as Lipschitz continuity; the paper points out that some form of smoothness is probably required to solve continuous MDPs in general
  4. Policy search is generally done by some gradient method
  5. Quite a few (9) assumptions are made, but I don’t understand them all well
  6. Main contributions:
    1. First finite-time bounds for value-function-based RL in continuous state and action spaces
    2. First analysis of fitted Q-iteration, an algorithm that has proved useful in a number of cases, even when used with non-averagers, for which no previous theoretical analysis existed
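
A minimal sketch of how I read the structure described above: each iteration fits Q by regression, using the current policy's action at the next state in place of a max over actions, and then picks the policy from the candidate set that maximizes the fitted Q on the sampled states. Everything concrete here is an assumption for illustration, not taken from the paper: the 1-D toy dynamics, the quadratic `features`, the clipped-linear policy class in `make_policy`, and the grid search over `thetas` (standing in for the gradient-based policy search mentioned in the notes).

```python
import numpy as np

def features(x, a):
    """Hypothetical quadratic features of (state, action) for a linear Q-function."""
    x, a = np.asarray(x, dtype=float), np.asarray(a, dtype=float)
    return np.stack([np.ones_like(x), x, a, x * a, x**2, a**2], axis=-1)

def make_policy(theta):
    """Hypothetical policy class: clipped linear feedback a = clip(theta * x, -1, 1)."""
    return lambda x: np.clip(theta * x, -1.0, 1.0)

def fitted_q_iteration(x, a, r, x_next, thetas, gamma=0.9, n_iters=20):
    phi = features(x, a)
    w = np.zeros(phi.shape[1])   # Q_0 = 0
    theta = thetas[0]            # arbitrary initial policy from the candidate set
    for _ in range(n_iters):
        # Regression step: the target uses the current policy's action at the
        # next state, avoiding a max over a continuous action space.
        next_a = make_policy(theta)(x_next)
        targets = r + gamma * features(x_next, next_a) @ w
        w, *_ = np.linalg.lstsq(phi, targets, rcond=None)
        # Policy search step: pick the candidate policy with the highest
        # average fitted Q-value over the sampled states (grid search here).
        scores = [np.mean(features(x, make_policy(t)(x)) @ w) for t in thetas]
        theta = thetas[int(np.argmax(scores))]
    return w, theta

# Toy batch of transitions (assumed dynamics: noisy drift, reward penalizes distance from 0).
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, n)
a = rng.uniform(-1, 1, n)
x_next = np.clip(x + 0.5 * a + 0.05 * rng.standard_normal(n), -1, 1)
r = -x_next**2

thetas = np.linspace(-2, 2, 21)
w, theta = fitted_q_iteration(x, a, r, x_next, thetas)
print("selected policy gain:", theta)
```

With a linear least-squares regressor this is only a stand-in for whatever function class one would actually use; the point is just the alternation between the regression step and the search over the policy set.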

One thought on “Ideas from Fitted Q-iteration in continuous action-space MDPs: Andras Antos, Csaba Szepesvari, Remi Munos”

  1. Michael Littman says:

    Cool. Author(s)?
