Bayesian Inference in MCTS. Tesauro, Rajan, Segal. UAI 2010.

  1. Is inspired by distribution-free (frequentist) approaches like UCT but uses Bayesian Inference to perform estimates of values
  2. Bayesian approach allows alg designers to leverage evaluation functions differently
  3. When assumptions hold, it is Bayes-optimal 
  4. Stochastic trial results at leaf nodes are combined with the prior to create posterior estimates of value in the nodes that are then propagated upwards, which occurs recursively
  5. For MDPs, distributions are based on a distributional max operator (there is a corresponding min for game trees)
    1. Likewise, there are distribution based upper-confidence bounds
  6. Bayesian estimates allow for faster convergence with limited data compared to UCT
  7. A paper is cited that says average value estimates are (sample) inefficient ad often underestimate the true value
  8. Computation costs are within an order of magnitude of UCT
    1. Because the approach is more sample efficient, it is advantageous in cases where sampling is expensive (such as Go)
  9. Bayesian MCTS is robust to sampling policies.  Convergence of UCT depends on focusing majority of trials on the optimal path
    1. Bayesian MCTS should better leverage parallelization / off policy sampling
  10. Max/min values are based on extremum distributions.  Whats that?
    1. It is a set of random variables
    2. Min, max values are then computed based on this set.  Can be done by integration or Gaussian approximation
  11. The modifications to UCT proposed are very simple
  12.  Gaussian approximations are nice because they are very nice.   Also, some simple calculation can take into account correlation of sibling leaf nodes
  13. Looks like worst-case scenarios that are possible when using Gaussian approximation don’t come up in practice
  14. The empirical domains are very simple, but reasonable

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: