- Concerned with search in deterministic continuous state and action spaces, time is discrete
- Deals with goal-finding in the presence of a heuristic function, paths must be bounded on their length,
*d* - Getting exactly optimal results in continuous domains is only possible at the limit, so instead planning is done to epsilon-accuracy
- If the goal state cannot be found in
*d*steps, it will return the first*d*steps of a path that may reach the goal optimally, as well as its lower bound, which can be done because of the heuristic - Argue that use of forward search allows for more efficient algorithms because of pruning
- Discuss algorithms from the motion planning literature: Probabalistic Roadmaps, and Rapidly Exploring Random Trees
- Issues here are motion planning is only concerned with finding a path from start to goal, and does not care about optimality
- Additionally, motion planning algorithms assume continuous time, and when time is discrete the constraints are non-differentiable

- In general, the literature on continuous state, action, discrete time is very limited
- Transitions, cost, and heuristic functions must be Lipshitz
- Algorithm basically does Lipshitz optimization over the cost of paths starting from the range of actions at each node in the rollout tree
- Samples are done in a rollout-manner, as opposed to a depth-first blowout of the samples, so
*M*, the action -> cost function for each node is updated for all nodes encountered during a rollout after it completes - They discuss how to do the updates based on the Lipshitz constraints in higher dimensions than 1, where there resulting constraints are an envelope of hyper-cones, but they estimate this with rectangles, which is what I have seen in the literature on pure Lipshitz optimization. This can be done in a way that is always pessimistic, so it doesn’t really mess up the results (wont prevent an epslion-optimal result to be found, but it may take more samples than if cones were actually used)
- Guarantees (with proofs):
- The algorithm terminates
- At termination, it returns a lower bound on the cost of any plan
- If the plan returned is complete, it is at most epsilon worse than optimal
- If the plan returned is partial, the beginning of the plan returned in the best case will be epsilon worse than optimal

- The paper is primarily theoretical, but gives fair space to how it may actually be implemented (for example, approximating cones with rectangles)
- Discuss importance of extending to to stochastic setting)

Advertisements