- Introduces a framework, *Learning Depth-First Search* (LDFS)
- Naturally reduces to some existing algorithms, like IDA* and MTD
- Yields some new algorithms

- “The LDFS framework is built around two simple notions: depth-first search and *learning* in the sense of… [Barto, Bradtke & Singh], where state values are updated to make them consistent with the values of their successors.”
- The LDFS framework shows that one of the main ideas behind most search algorithms is “the use of iterated depth-first searches with memory” for improving the accuracy of the value estimates for states until one is found that is optimal
- By providing a framework in which an algorithm can easily be recognized as a variant of some other algorithm, it becomes easier to infer what strengths and weaknesses that algorithm may have
- The framework relates IDA* to its use in game-tree search, which in turn relates it closely to MTD
- And/Or graphs?
- The framework applies across deterministic and stochastic models, cyclic and acyclic spaces, and trees/game trees/AND-OR trees
- As expected for cost minimization, values are not discounted
- Some algorithms require that the goal state be reachable from all other states with nonzero probability
- They assume a heuristic value function exists that is a monotone (consistent) lower bound on the optimal cost (in our speak, an upper bound for reward maximization), which means the heuristic is admissible
- Algorithms are framed in terms of “Find-and-Revise”: a state whose value estimate is inconsistent with its successors is found by some method, and then that estimate is revised (updated)
- An algorithm belongs to the Find-and-Revise[epsilon] family if on each iteration it either finds (and revises) a state reachable from the start state whose residual exceeds epsilon, or proves that no such state exists and terminates
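The Find-and-Revise loop above can be sketched in code. This is my own toy illustration for the deterministic shortest-path case, not the paper's pseudocode: all names (`find_and_revise`, `value`, `q`, the model callbacks) are made up, and cycle handling is simplified.

```python
def find_and_revise(s0, goal, actions, succ, cost, h, eps=1e-6, max_iters=100000):
    """Toy Find-and-Revise[eps] sketch for a deterministic cost-minimization model.

    V starts at the (assumed admissible) heuristic h and is revised with
    Bellman updates until the greedy policy from s0 is eps-consistent.
    """
    V = {}  # revised value estimates; states absent from V fall back to h

    def value(s):
        return 0.0 if goal(s) else V.get(s, h(s))

    def q(s, a):
        # one-step lookahead: Q(s,a) = c(s,a) + V(succ(s,a))
        return cost(s, a) + value(succ(s, a))

    for _ in range(max_iters):
        # FIND: walk greedily from s0 looking for a state whose residual
        # |V(s) - min_a Q(s,a)| exceeds eps (simplified: stop if we revisit a state)
        s, found, seen = s0, None, set()
        while not goal(s) and s not in seen:
            seen.add(s)
            best_a = min(actions(s), key=lambda a: q(s, a))
            if abs(value(s) - q(s, best_a)) > eps:
                found = s
                break
            s = succ(s, best_a)
        if found is None:
            return value  # greedy policy from s0 is eps-consistent; done
        # REVISE: Bellman update on the inconsistent state
        V[found] = min(q(found, a) for a in actions(found))
    raise RuntimeError("did not converge within max_iters")


# Hypothetical 4-state line graph: move +1 or +2 toward state 3, unit costs.
# Optimal cost from 0 is 2.0 (0 -> 2 -> 3 or 0 -> 1 -> 3).
vf = find_and_revise(
    s0=0,
    goal=lambda s: s == 3,
    actions=lambda s: [a for a in (1, 2) if s + a <= 3],
    succ=lambda s, a: s + a,
    cost=lambda s, a: 1.0,
    h=lambda s: 0.0,  # trivially admissible lower bound
)
print(vf(0))  # converged estimate of the optimal cost from the start state
```

With the zero heuristic, each pass finds an inconsistent state and raises its value until `vf(0)` reaches the true shortest-path cost, 2.0.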

- They don't care about cycles as long as the cycles are not part of the solution
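To make the “iterated depth-first searches with memory” idea concrete, here is a hedged sketch of the deterministic, acyclic special case, where LDFS behaves like IDA* with a transposition table. This is an illustration under my own assumptions (positive costs, goal reachable, every non-goal state has at least one action), not the paper's actual LDFS pseudocode; all names are illustrative.

```python
def ldfs_solve(s0, goal, actions, succ, cost, h):
    """Toy LDFS-style solver: repeated depth-first passes that only follow
    actions consistent with the current value bound, learning (raising)
    V(s) whenever a state's depth-first search fails."""
    V = {}  # learned lower bounds, seeded from the (assumed admissible) heuristic

    def value(s):
        return 0.0 if goal(s) else V.get(s, h(s))

    def ldfs(s):
        if goal(s):
            return True
        for a in actions(s):
            s2 = succ(s, a)
            # prune actions whose one-step lookahead exceeds the current bound,
            # i.e. only explore where c(s,a) + V(s') <= V(s)
            if cost(s, a) + value(s2) > value(s) + 1e-9:
                continue
            if ldfs(s2):
                return True
        # no consistent action reached the goal: learn a tighter lower bound
        # (assumes at least one applicable action at every non-goal state)
        V[s] = min(cost(s, a) + value(succ(s, a)) for a in actions(s))
        return False

    # iterate depth-first searches until one reaches the goal
    # (loops forever if the goal is unreachable from s0)
    while not ldfs(s0):
        pass
    return value(s0)  # with an admissible heuristic, this is the optimal cost


# Same hypothetical line graph as before: states 0..3, steps +1/+2, unit costs.
opt = ldfs_solve(
    s0=0,
    goal=lambda s: s == 3,
    actions=lambda s: [a for a in (1, 2) if s + a <= 3],
    succ=lambda s, a: s + a,
    cost=lambda s, a: 1.0,
    h=lambda s: 0.0,
)
print(opt)  # optimal cost 0 -> 2 -> 3 (or 0 -> 1 -> 3) is 2.0
```

The key mechanics match the notes above: each failed depth-first pass strictly raises some state values (the “learning”), and the memory in `V` is what the framework identifies as the common ingredient of IDA*-with-transposition-tables and its relatives.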