Information-Based Alpha-Beta Seach and the Homicidal Chauffeur. Todd W. Neller. Hybrid Systems: Computation and Control. 2002 | Ari Weinstein's Research

Information-Based Alpha-Beta Seach and the Homicidal Chauffeur. Todd W. Neller. Hybrid Systems: Computation and Control. 2002

Discusses why control techniques that may be locally optimal are often not globally optimal

As title suggests, algorithm proposed is called Information-Based A-B Search

The homicidal chauffeur (HC) is similar to the cat and mouse game, and is in the class of differential games

The version of HC here is slightly different from the normal version (one way is that moves are not simultaneous)

Major differences between this version and classic A-B search is:

Because of the formalism used which involves a delay function (which I don’t really understand) the utility function is computed with an extra step, seems minor

There is a continuum of actions to be selected from

There is also a difference between the normal formalism of finding next states

The action selection method used over the continuum of actions is in the class of max-likelihood optimization techniques, called information-based optimization

The meat of the details is left to another paper

Basic goal is that the algorithm starts with a target value. At each step of optimization the algorithm evaluates the point most likely to have the value given the history

The version they use assumes parabolic curves in between sample points, so the result is probably similar in character to gaussian process optimization

Even though the action selection is computationally expensive, more efficiently searching the tree can easily compensate for this

In empirical results they bear this out, as search size increases

They see nearly exponential savings in wall time as search breadth increases against random action selection

They have the opposite result for search depth, propose using random deeper in the tree

When facing uniform action discretization they see exponential savings over uniform action discretization

It would be neat to put FS3+HOO against this algorithm.

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.
To find out more, including how to control cookies, see here:
Cookie Policy