- Deals with sparse-sampling style algorithms
- Discusses lookahead pathologies, which is the case where (even with discounting, it seems), increasing the horizon H, while keeping the sample-width C constant eventually increases the error – this means a simple iterative deepening approach will eventually lead to poor decisions.
- I was familiar with this, but only in the context where tree-search is being performed and one (imperfect) evaluated value (not actual bounds) of the state is used in the leaf nodes.
- The have some demonstrations of this in action
- The idea in the paper is to make a good trade off between horizon and width by maintaining an estimation of the global sampling error on the tree.
- The estimation is a heuristic which indicates whether some state/actions need additional sampling
- Uses confidence intervals to estimate error of only the “finite sampling” (is this just the possible error on the expected value?). Assumes gaussian distribution
- The errors are also propagated through the tree