- Discusses why control techniques that may be locally optimal are often not globally optimal
- As title suggests, algorithm proposed is called Information-Based A-B Search
- The homicidal chauffeur (HC) is similar to the cat and mouse game, and is in the class of
*differential games* - The version of HC here is slightly different from the normal version (one way is that moves are not simultaneous)
- Major differences between this version and classic A-B search is:
- Because of the formalism used which involves a delay function (which I don’t really understand) the utility function is computed with an extra step, seems minor
- There is a continuum of actions to be selected from
- There is also a difference between the normal formalism of finding next states
- The action selection method used over the continuum of actions is in the class of max-likelihood optimization techniques, called
*information-based optimization*

- The meat of the details is left to another paper
- Basic goal is that the algorithm starts with a target value. At each step of optimization the algorithm evaluates the point most likely to have the value given the history
- The version they use assumes parabolic curves in between sample points, so the result is probably similar in character to gaussian process optimization
- Even though the action selection is computationally expensive, more efficiently searching the tree can easily compensate for this
- In empirical results they bear this out, as search size increases
- They see nearly exponential savings in wall time as search breadth increases
*against random action selection* - They have the opposite result for search depth, propose using random deeper in the tree
- When facing uniform action discretization they see exponential savings over uniform action discretization
- It would be neat to put FS3+HOO against this algorithm.