Qualitative Hybrid Control of Dynamic Bipedal Walking. Ramamoorthy, Kuipers.

Approach is concerned with dynamic walking on uneven terrain

A common approach is to formulate the walking problem as a linear system, which can be solved with a number of methods. This is problematic because it reduces energy efficiency, overly constrains the types of gaits that can be found, and requires constant compensation for the body's natural (nonlinear) dynamics, which introduces other forms of complexity

Dynamic walkers exploit, rather than overcome, the inherent nonlinearities involved in legged locomotion

The idea is to only introduce small corrections into the trajectories that will naturally occur in the system
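
To make the "small corrections" idea concrete for myself, here is a minimal sketch. It is not the paper's controller. It assumes a unit-mass pendulum with the angle measured from upright, a hypothetical gain, and a simple energy-based correction: the passive dynamics do most of the work and a small term nudges the energy toward a target.

```python
import math

G_OVER_L = 9.81  # gravity / leg length (assumed value)

def energy(theta, omega):
    # Normalized mechanical energy; theta is measured from the upright position.
    return 0.5 * omega ** 2 + G_OVER_L * math.cos(theta)

def simulate(theta, omega, e_target, gain=0.05, dt=1e-3, steps=20000):
    # Semi-implicit Euler integration of passive dynamics plus a small correction.
    for _ in range(steps):
        # Correction pushes along the motion when energy is too low,
        # and against it when energy is too high.
        u = -gain * (energy(theta, omega) - e_target) * omega
        omega += (G_OVER_L * math.sin(theta) + u) * dt
        theta += omega * dt
    return theta, omega

e_target = G_OVER_L + 0.5          # hypothetical target: enough energy to pass over the top
e_initial = energy(0.3, 0.0)
theta_f, omega_f = simulate(0.3, 0.0, e_target)
e_final = energy(theta_f, omega_f)
```

With this correction the energy changes at rate `omega * u`, so it drifts toward `e_target` without the controller ever cancelling the gravity term.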

Discusses motivation based on studies of how animals walk and how humans learn to walk.

Cost function is linear in the actions, though piecewise-linear costs can also be handled

Solutions are nonstationary finite horizon

Sets up the Bellman equation as a linear program so that it can handle actions with hundreds to tens of thousands of dimensions (not yet sure how this differs from the linear programming that can be done in the vanilla setting)
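
A sketch of how a single time step can stay a linear program, written in my own notation rather than the paper's. The symbols $x_t$ (action), $c_t$ (linear cost/contribution), $R_t$ (storage level), $a$ (net flow into storage), $\bar v_{t,k}$ (segment slopes), $y_k$ (amount in segment $k$) and $\delta$ (segment width) are all assumptions for illustration.

```latex
% Finite-horizon Bellman recursion (the value depends on t, hence nonstationary)
V_t(S_t) = \max_{x_t \in \mathcal{X}_t}
  \Big( c_t^\top x_t + \mathbb{E}\big[ V_{t+1}(S_{t+1}) \mid S_t, x_t \big] \Big),
\qquad V_{T+1} \equiv 0

% Replace the expectation with a concave piecewise-linear function of the
% storage level, with slopes \bar v_{t,1} \ge \bar v_{t,2} \ge \dots \ge \bar v_{t,K}:
\max_{x_t,\, y} \quad c_t^\top x_t + \sum_{k=1}^{K} \bar v_{t,k}\, y_k
\quad \text{s.t.} \quad
A_t x_t = b_t, \quad x_t \ge 0, \quad
\sum_{k=1}^{K} y_k = R_t + a^\top x_t, \quad
0 \le y_k \le \delta
```

Because the slopes are non-increasing, an optimal solution fills the segments from left to right, so the sum equals the piecewise-linear function at the resulting storage level and the whole step is an ordinary LP in $(x_t, y)$.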

A previous paper uses a similar approach and solves a domain with a 20,000-dimensional action space over 8,760 time steps (a year). That's crazy.

They then introduce a pure exploitation algorithm called SPAR-Storage. It constructs a concave piecewise linear approximation of the value function.

In the limit, the approximation and the true value function match at the optimum

All pieces begin with zero slope and zero value and are iteratively improved
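
A minimal sketch of what such an iteratively improved concave piecewise-linear approximation could look like. The function names, the fixed segment width, and the simple "leveling" step used to restore concavity are my assumptions; the paper's exact update may differ.

```python
def evaluate(slopes, r, delta=1.0):
    # Value at storage level r of a piecewise-linear function with value 0 at r = 0.
    total, remaining = 0.0, r
    for v in slopes:
        if remaining <= 0:
            break
        step = min(delta, remaining)
        total += v * step
        remaining -= step
    return total

def update_slopes(slopes, k, sampled_slope, stepsize):
    # Smooth the slope of segment k toward a sampled slope estimate.
    new = list(slopes)
    new[k] = (1 - stepsize) * slopes[k] + stepsize * sampled_slope
    # Restore concavity: slopes must be non-increasing from left to right.
    for i in range(k):
        if new[i] < new[k]:
            new[i] = new[k]
    for i in range(k + 1, len(new)):
        if new[i] > new[k]:
            new[i] = new[k]
    return new

slopes = [0.0] * 4                            # all pieces start with zero slope and value
slopes = update_slopes(slopes, 1, 3.0, 0.5)   # -> [1.5, 1.5, 0.0, 0.0]
value = evaluate(slopes, 1.5)                 # -> 2.25
```

Only the visited segment is updated from data; its neighbours move only as far as needed to keep the function concave, which is what lets the per-step problem remain a linear program.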

Even for very large states in terms of energy storage (tens of thousands), the algorithm converged in about 100 iterations. On the other hand, the algorithm is sensitive to the size of the rest of the state vector (only a handful of dimensions is practical), although they present some ideas for how to improve this

There is a convergence proof that I am not reading carefully right now