Stochastic Complementarity for Local Control of Discontinuous Dynamics. Tassa, Todorov. RSS 2010

  1. Aside from floor contacts, joint limits also introduce discontinuities
  2. As in Tom Erez’s work, they introduce stochasticity to smooth the problem so that methods that require smoothness can be used
  3. “The moral is that optimal control of discontinuous dynamics cannot give acceptable results with a deterministic model”
    1. Results from HOO/HOLOP show that you can get extremely good results with discontinuous models, as long as the value function is smooth near the maximum
    2. The authors may be discussing continuous time, though, which is a different setting
  4. “The modeling and simulation of multibody systems with contacts and friction is a broad and active field of research.  One reasonable approach, not investigated here, is to model discontinuous phenomena with a continuous-time hybrid dynamical system which undergoes switching.  These methods require accurate resolution of collision and separation times, so fixed-time step integration with a fixed computational cost is impossible.”
  5. This paper also performs local dynamic programming
  6. Mentions Tedrake’s LQR-trees as a way of measuring the size of basins of attraction for these local methods
  7. Criticizes another paper that develops a solution to deterministic race car driving, arguing that the approach is not robust because it comes too close to failure modes such as the limit of adhesion.  I disagree with that – if the problem is deterministic, you should take advantage of that.
    1. Limits of adhesion are used in that paper because it analyzes domains with discontinuities, and limits of adhesion are another example of that
    2. Unfortunately, the web page that describes the problem is gone so I can’t test against it
  8. In this paper they reuse rollouts by cutting off the beginning of the previous result and then appending a copy of the last control to the end
  9. Also use trajectory libraries here around the limit cycle
  10. The domain tested models a finger flicking a spinner – they come up with a limit cycle for the problem
  11. 40s of planning per step (after library and shifting of previous steps) on a 4-core i7
  12. The controller was robust to human manipulation of the domain online
  13. When the noise in the model was made too small, the method sometimes failed to produce a productive limit cycle, or the trajectory optimizer did not converge
  14. They say one thing they would like to do is to compute solutions over the full distributions of the system as opposed to sampled points, but that introduces other problems
    1. The distributions become multimodal on impact, so Gaussian approximations no longer work
  15. Also mentions problems with the method, such as inadequate exploration: the initial controller is passive (no force), so the only reason it made contact at all was that gravity brought the finger into contact with the spinner
  16. Ellipses allow the dynamics to be differentiable (with added noise), but some other shapes would prevent differentiability altogether
  17. The equations used to control the policy mix quantities of different units, so choosing different units for either distances or impulses would lead to different policies, which is not desirable
    1. “in many ways it is easier to write down a numerical method for rigid-body dynamics than it is to say exactly what the method is trying to compute”
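
The rollout reuse described in item 8 can be sketched in a few lines. This is only an illustration of the shifting step as summarized in these notes, not the authors' code: the function name `shift_controls` and the list-of-lists layout for the control sequence are assumptions made for the example, and it assumes exactly one control is dropped per step.

```python
from typing import List, Sequence


def shift_controls(prev_controls: Sequence[Sequence[float]]) -> List[List[float]]:
    """Build a warm start for the next plan from the previous plan.

    prev_controls holds one control vector per time step. The first control
    is assumed to have been executed already, so it is dropped; a copy of the
    final control is appended so the horizon length stays the same.
    (Hypothetical helper, named for this example only.)
    """
    if len(prev_controls) == 0:
        raise ValueError("expected a non-empty control sequence")
    shifted = [list(u) for u in prev_controls[1:]]  # cut off the beginning
    shifted.append(list(prev_controls[-1]))         # repeat the last control
    return shifted


plan = [[0.0], [1.0], [2.0], [3.0]]
warm_start = shift_controls(plan)
print(warm_start)  # [[1.0], [2.0], [3.0], [3.0]]
```

The shifted sequence would then be handed to the trajectory optimizer as its initial guess, which is presumably why the 40s planning figure in item 11 is quoted as being after shifting.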
