Monte-Carlo Planning in Large POMDPs. David Silver, Joel Veness. NIPS 2010.


  1. Deals with how to get MCTS (specifically UCT) working in POMDPs
  2. Maintains belief state with a particle filter
  3. The tree maintained by UCT is modified slightly
    1. Counts are kept per history (the actions and observations over time).
    2. History maps to the belief state
  4. The UCT variant used here is the unofficial one that adds a single node to the tree after each iteration
  5. The authors discuss convergence of UCT and of this version for POMDPs, but don’t mention that sometimes actual learning is intractable
  6. The empirical results are impressive. In smaller domains, more exact methods slightly outperform PO-UCT, but PO-UCT is far cheaper to run (by multiple orders of magnitude).
  7. Some of the details (especially related to the particle filter) are omitted, but the source code was released
  8. Seems like there is very little literature on POMDPs and MCTS.  The stuff here is nice, although more details on the particle filter would be helpful
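
To make points 2 and 3 concrete, here is a minimal sketch of the two pieces: a search tree whose statistics are keyed by history, and an unweighted particle filter that updates the belief by rejection sampling. This is not the authors' released code. The names `HistoryTree`, `update_belief`, and `tiger_step` are hypothetical, the toy "tiger" model is only there to exercise the code, and the UCB1 rule with exploration constant `c` is the standard one, assumed rather than taken from the notes above.

```python
import math
import random
from collections import defaultdict


class HistoryTree:
    """Search statistics keyed by history (a tuple of actions and observations).

    Hypothetical helper for illustration, not the paper's data structure.
    """

    def __init__(self, actions, c=1.0):
        self.actions = actions
        self.c = c                   # UCB1 exploration constant (assumed value)
        self.N = defaultdict(int)    # visit count per history
        self.V = defaultdict(float)  # mean return per history ending in an action

    def ucb_action(self, h):
        # Try each untried action once before applying UCB1.
        for a in self.actions:
            if self.N[h + (a,)] == 0:
                return a
        log_n = math.log(self.N[h])
        return max(
            self.actions,
            key=lambda a: self.V[h + (a,)]
            + self.c * math.sqrt(log_n / self.N[h + (a,)]),
        )

    def update(self, h, a, ret):
        # Incremental mean of the returns seen after taking a in history h.
        ha = h + (a,)
        self.N[h] += 1
        self.N[ha] += 1
        self.V[ha] += (ret - self.V[ha]) / self.N[ha]


def update_belief(particles, action, observation, step, rng, max_tries=100_000):
    """Unweighted particle filter via rejection sampling.

    `step(state, action, rng)` is an assumed generative model returning
    (next_state, observation, reward). A sampled successor is kept only if
    its simulated observation matches the real one.
    """
    new_particles = []
    tries = 0
    while len(new_particles) < len(particles) and tries < max_tries:
        tries += 1
        s = rng.choice(particles)
        s2, o, _ = step(s, action, rng)
        if o == observation:
            new_particles.append(s2)
    return new_particles


def tiger_step(state, action, rng):
    """Toy model: listening reports the true state with probability 0.85."""
    other = "right" if state == "left" else "left"
    obs = state if rng.random() < 0.85 else other
    return state, obs, -1.0


if __name__ == "__main__":
    rng = random.Random(0)
    prior = ["left"] * 500 + ["right"] * 500
    posterior = update_belief(prior, "listen", "left", tiger_step, rng)
    print(posterior.count("left") / len(posterior))
```

Keying everything on the history tuple means the tree never needs an explicit belief representation at internal nodes; the particle set only has to be maintained at the root and refreshed after each real action and observation.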

One thought on “Monte-Carlo Planning in Large POMDPs. David Silver, Joel Veness. NIPS 2010.”

  1. Sergiu says:

    “Seems like there is very little literature on POMDPs and MCTS.” – and I think we should do something about that.
