- Approach is concerned with dynamic walking on uneven terrain
- A common approach is to formulate the walking problem as a linear system, which can be solved with a number of methods. This is problematic because it reduces energy efficiency, overly constrains the types of gaits that can be found, and requires constant compensation for what the body naturally does (which is what introduces the nonlinearity), and that compensation introduces other forms of complexity
- Dynamic walkers utilize, instead of overcoming, the inherent nonlinearities involved in legged locomotion
- The idea is to only introduce small corrections into the trajectories that will naturally occur in the system
- Discuss motivation based on studies of how animals walk, and how humans learn to walk.
- Cost function is linear in the actions, but can be piecewise linear
- Solutions are nonstationary and finite-horizon
- Set up the Bellman equation as a linear program so that it can handle actions with hundreds to tens of thousands of dimensions (not yet sure how this is different from linear programming that can be done in the vanilla setting)
- A previous paper uses a similar approach and solves a domain with a 20,000-dimensional action space over 8,760 time steps (a year). That's crazy.
- They then introduce a pure exploitation algorithm called SPAR-Storage. It constructs a concave piecewise linear approximation of the value function.
- In the limit, the approximation and the true value function match at the optimum
- All pieces begin with zero slope and zero value and are iteratively improved
- Even for very large states in terms of energy storage (tens of thousands), the algorithm converged in about 100 iterations. On the other hand, the algorithm is sensitive to the size of the rest of the state vector (only a handful of dimensions is practical), although they present some ideas as to how to improve it
- There is a convergence proof that I am not reading carefully right now
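
To make the concave piecewise linear approximation more concrete, here is a minimal sketch of one way such an approximation could be stored and updated. It is not taken from the paper: the function names `update_slopes` and `value`, the fixed step size, and the simple "leveling" rule used to restore concavity are all assumptions chosen for illustration. The approximation is kept as one slope per unit of storage, every slope starts at zero, and concavity means the slopes never increase as the storage level grows.

```python
def update_slopes(slopes, idx, observed, step):
    """Smooth slopes[idx] toward an observed marginal value, then restore concavity.

    Assumption: a simple leveling rule is used as the concavity-restoring step;
    the paper's actual update may differ.
    """
    new = list(slopes)
    new[idx] = (1 - step) * slopes[idx] + step * observed
    # Concave means slopes are non-increasing in the storage level.
    for i in range(idx):                 # segments to the left must be >= new[idx]
        if new[i] < new[idx]:
            new[i] = new[idx]
    for i in range(idx + 1, len(new)):   # segments to the right must be <= new[idx]
        if new[i] > new[idx]:
            new[i] = new[idx]
    return new


def value(slopes, level):
    """Approximate value of holding `level` whole units: sum of the first `level` slopes."""
    return sum(slopes[:level])


# Hypothetical toy run: five storage segments, all starting at zero slope and zero value.
slopes = [0.0] * 5
for idx, observed in [(2, 10.0), (0, 1.0), (4, 6.0)]:
    slopes = update_slopes(slopes, idx, observed, step=0.5)
```

Storing slopes rather than values keeps the concavity check trivial (a monotonicity test on a list), which is one plausible reason this kind of representation scales to very large storage levels. The visited segments and observed values in the toy run are made up; in the algorithm described in the notes they would come from acting greedily with respect to the current approximation.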