Learning Depth-First Search: A Unified Approach to Heuristic Search in Deterministic and Non-Deterministic Settings, and its application to MDPs. Bonet, Geffner. ICAPS 2006 | Ari Weinstein's Research

Learning Depth-First Search: A Unified Approach to Heuristic Search in Deterministic and Non-Deterministic Settings, and its application to MDPs. Bonet, Geffner. ICAPS 2006

Introduces a framework, Learning Depth-First Search.

Naturally reduces to some existing algorithms, like IDA*, and MTD

Yields some new algorithms

“The LDFS framework is built around two simple notions: depth-first search and learning in the sense of…[Barto, Bradtke & Singh], where state values are updated to make them consistent with the values of their successors.”

The LDFS framework shows that one of the main ideas behind most search algorithms is “the use of iterated depth-first searches with memory” for improving the accuracy of the value estimates for states until one is found that is optimal

By coming up with a framework that easily allows algorithms to be recognized as a variant of some other algorithm, it becomes easier to infer what strengths and weaknesses an algorithm may have

The framework relates IDA* to use in game tree search which relates it closely to MTD

And/Or graphs?

Framework functions in deterministic/stochastic, cyclic/acyclic, trees/game trees/andor trees

As expected, values are not discounted, doing cost minimization

Some algorithms require that the goal state be reachable from all other states with probability nonzero

The assume that a heuristic value function exists that is a monotone lower bound (in our speak, upper bound for reward maximization), which means the heuristic is admissible

Frame algorithms in terms of “find and revise,” where a node is selected for expansion by some method, and then value estimates are updated based on that expansion

An algorithm is called Find-and-Revise[epsilon] if after each iteration of the algorithm, it either returns a state reachable from the start state with residual less than epsilon, or proves that no state with that property exists (that hasnt already been expanded?) and terminates

They dont care about cycles as long as the cycles are not part of the solution