The Roamer Planner Random-Walk Assisted Best-First Search. IPPC Competition 2011.

<Achieved 5th of 27 planners in the area it competed in the IPPC>

  1. Motivation is that when doing best-first search, algorithm may hit plateaus, where the believed best solution doesn’t change for a long period of time
    1. Is a problem in hardcore search/optimization like SAT/CSP
  2. Goal then, is to find what is called an “exit state,” which is a state that will cause a change in the heuristic value
  3. The algorithm presented here, ROAMER, uses random search to help find an exit state when a plateau is hit
  4. Description contains mention of some search methods I don’t know about <more of use to SAT/CSP community than RL probably>, but seems to be based on an iterative version of A*
  5. Cite a paper that introduces MC random exploration for search
  6. Basically these style planners do some form of heuristic BFS
  7. In the case that the heuristic is perfect, a planner will expand only the path from start to goal, but even a small error in the heuristic function it may have to expand exponential number of states to find goal
  8. In the formulation heuristic values are positive and decrease as search deepens until the goal is found with value 0
    1. In cases considered, the heuristic values of s, s’ (its successor) are the same, so there is no “gradient”
  9. <The premise seems a little ironic here – a standard claim in optimization is that when you reach a plateau, there is no gradient to follow, and then the optimization enters a random walk and you are in trouble.  Here, they say when you reach a plateau, get out of it through a random walk.>
  10. Reasons to do random walk:
    1. Better at jumping out of local minima
    2. Doesn’t require use of heuristic function, so can be cheaper to run
    3. Basically memory free, so does not add to cost of bestFS
  11. In SAT/CSP heuristic depends on the number of features not satisfied; in most cases changing the statement won’t satisfy an additional feature, so the heuristic value of both states are equivalent
  12. Mention tabu search and WalkSAT <only familiar with the first>
  13. When dealing with problems that have more helpful heuristics the random search is less useful.
  14. Best-first drives the search, but random search is invoked when deemed useful
  15. Basically if a plateau is detected (when to report a plateau is a parameter), does random walks (to a finite horizon) until a change in heuristic value occurs

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: