Rollout Algorithms for Constrained Dynamic Programming. Bertsekas. LIDS 2005

<This is a tech report?>

  1. Based on the abstract, seems similar to, and his other work that deals with heuristics and rollouts
  2. “Under suitable assumptions, we show that if the base heuristic produces a feasible solution, the rollout algorithm also produces a feasible solution, whose cost is no worse than the cost correspoding to the base heuristic.”
  3. <Important!> “In the base heuristic is a DP policy, i.e., it generates state/control sequences via feedback functions μk, k=0,1,…,N-1, so that at any state xk, it applies the control μk(xk), then the algorithm can be viewed as a policy improvement step of the policy iteration method.  Based on the generic results for policy iteration, it then follows that the rollout algorithm’s cost <doing cost minimization in this setting> is no worse than the one of the base heuristic.  This cost improvement property has been extended in [BTW97 <Bertsekas, Tsitsiklis, Wu. Rollout algorithms for combinatorial optimiziation>] to base heuristics that are not DP policies, but rather satisfy an assumption called sequential improvement, which will be discussed shortly within the constrained context of this paper.”
  4. <Spring cleaning – Didn’t get to finish>

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: