Ideas from: Binary Action Search for Learning Continuous-Action Control Policies, Pazis and Lagoudakis

  1. Algorithm, called Binary Action Search (BAS) approximates a continuous action space to an arbitrary resolution via binary decisions which allow for (near, up to a specified resolution) optimal action selection
  2. Basic idea is to use the assumption that actions taken at nearby points in time should be somewhat similar, so instead of deciding which action to take, the goal is to decide by how much the last action taken should be altered.
  3. The action can be altered by decreasing or increasing the number corresponding to the action.  This can be done repeatedly before an actual action is executed.  For example, at first ask if action should be increased or decreased.  Based on that answer half of the remaining action space is discarded, from there the same question can be asked again from the midpoint of the actions that are remaining in consideration, even before a single action is taken from that state.  That allows more than just one small single adjustment.
    1. This seems like a bit of a problem though – if the Q function at that state over all actions isn’t monotonic, how is it really supposed to know whether the action should be incremented or decremented, unless it has already explored all actions down to that low resolution?  And if you are doing that, I don’t see how the binary searching is really helping much?  They point out that at least this helps the search time down to logarithmic in the resolution, as opposed to linear
  4. Makes no smoothness assumptions (but likewise, there isn’t much guarantee as to the performance)
Tagged , ,

One thought on “Ideas from: Binary Action Search for Learning Continuous-Action Control Policies, Pazis and Lagoudakis

  1. Michael Littman says:

    yeah, I’m sort of skeptical of this approach for the reasons you give.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: