Model-Based RL for Evolving Soccer Strategies. Wiering, Salustowicz, Schmidhuber

From Computational Intelligence in Games, 2001

  1. The paper studies model building; the abstract even claims that incomplete models can help find good policies
  2. Approach is a combination of CMACs and prioritized sweeping
  3. They didn’t use the official RoboCup simulator because its complexity made evaluating approaches difficult, so they spun their own.
    1. Their version is drastically simplified, which almost certainly makes model learning easy
  4. This is full soccer, so the reward is 1 when a goal is scored
  5. They claim that in continuous spaces, model learning is most successful when building local function approximators (but they don’t explain that adequately here)
  6. Looks like they build a different model for each tiling?
  7. They have to do some weird hacks to let agents share data in a way that doesn’t ruin the policies
    1. They cite MBIE as something similar to what they aren’t doing, but the connection isn’t exactly right
  8. They have to do random restarts on policies before they arrive at something that can be learned from
  9. They compare Q(lambda) to PIPE (Probabilistic Incremental Program Evolution)
  10. It looks like they give PIPE 5x as much training data?
  11. The “CMAC model” algorithm performs best, PIPE worst, with regular CMAC in the middle
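To make the per-tiling model idea (points 5 and 6) concrete, here is a minimal sketch of tile coding with a tiny tabular transition/reward model kept inside each tile. Everything here is illustrative: the 2-D toy state, the class name, and the averaging scheme are my assumptions, not the paper's implementation.

```python
# Sketch: CMAC-style tile coding over a 2-D continuous state, with a
# small local model (next-tile counts + reward sum) per (tile, action).
# All names and the 2-D toy state are hypothetical, not from the paper.
from collections import defaultdict

class TiledModel:
    def __init__(self, n_tilings=4, tiles_per_dim=8, low=0.0, high=1.0):
        self.n_tilings = n_tilings
        self.low = low
        self.width = (high - low) / tiles_per_dim
        # local model per (tile, action): next-tile counts and summed reward
        self.counts = defaultdict(lambda: defaultdict(int))
        self.reward_sum = defaultdict(float)

    def active_tiles(self, state):
        """One tile per tiling; each tiling is an offset copy of the grid."""
        tiles = []
        for t in range(self.n_tilings):
            offset = t * self.width / self.n_tilings
            coords = tuple(int((x - self.low + offset) / self.width)
                           for x in state)
            tiles.append((t, coords))
        return tiles

    def update(self, state, action, reward, next_state):
        """Record an observed transition in every active tile's local model."""
        next_tiles = self.active_tiles(next_state)
        for tile, ntile in zip(self.active_tiles(state), next_tiles):
            key = (tile, action)
            self.counts[key][ntile] += 1
            self.reward_sum[key] += reward

    def predicted_reward(self, state, action):
        """Average the local reward estimates across the active tilings."""
        est, n = 0.0, 0
        for tile in self.active_tiles(state):
            key = (tile, action)
            total = sum(self.counts[key].values())
            if total:
                est += self.reward_sum[key] / total
                n += 1
        return est / n if n else 0.0
```

The point of the sketch is that each tile only ever sees transitions that pass through it, so every tile's model is a *local* function approximator; averaging over tilings smooths the prediction.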
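For the prioritized-sweeping half of point 2, a heavily simplified tabular version on a toy deterministic chain MDP looks like the following; the function name, the chain setup, and the dict-based MDP encoding are my own illustration, not the paper's continuous-state variant.

```python
# Sketch: tabular prioritized sweeping on a toy deterministic chain MDP.
# States are swept in order of Bellman error, and each backup pushes the
# state's predecessors back onto the priority queue if their error grew.
import heapq

def prioritized_sweeping(transitions, rewards, gamma=0.9, theta=1e-6):
    """transitions: {state: next_state}; rewards: {state: reward on leaving}.
    Returns a dict V of state values (absent next-states are terminal)."""
    V = {s: 0.0 for s in transitions}
    preds = {}
    for s, s2 in transitions.items():
        preds.setdefault(s2, set()).add(s)

    def error(s):
        target = rewards[s] + gamma * V.get(transitions[s], 0.0)
        return abs(target - V[s])

    # seed the queue with every state's initial Bellman error
    pq = [(-error(s), s) for s in transitions if error(s) > theta]
    heapq.heapify(pq)
    while pq:
        _, s = heapq.heappop(pq)
        V[s] = rewards[s] + gamma * V.get(transitions[s], 0.0)
        # reprioritize predecessors whose estimates are now stale
        for p in preds.get(s, ()):
            e = error(p)
            if e > theta:
                heapq.heappush(pq, (-e, p))
    return V
```

On a chain 0 → 1 → 2 → goal with reward 1 only for reaching the goal, the sweep starts at the goal-adjacent state and propagates value backward, which is exactly why prioritized sweeping suits sparse rewards like the paper's goal-scoring signal (point 4).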
