Stochastic Complementarity for Local Control of Discontinuous Dynamics. Tassa, Todorov. RSS 2010

Aside from floor contacts, joint limits also introduce discontinuities
As in Tom Erez’s work, they introduce stochasticity to smooth the problem, so that methods that require smoothness can be used
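A minimal sketch of why added noise smooths a discontinuity, using a toy step function rather than the paper's contact model. The names `step`, `smoothed_step`, and `finite_difference` are made up for illustration. Taking the expectation of a unit step under Gaussian noise gives the Gaussian CDF, which is differentiable everywhere:

```python
import math

def step(x):
    """Toy discontinuous 'contact' function: 0 before contact, 1 after."""
    return 1.0 if x > 0.0 else 0.0

def smoothed_step(x, sigma):
    """E[step(x + eps)] for eps ~ N(0, sigma^2), which is the Gaussian CDF at x / sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def finite_difference(f, x, h=1e-4):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

sigma = 0.1
# The raw step has no usable derivative at 0: this estimate grows without bound as h shrinks.
raw_slope = finite_difference(step, 0.0)
# The smoothed version has a finite slope at 0, equal to 1 / (sigma * sqrt(2 * pi)).
smooth_slope = finite_difference(lambda x: smoothed_step(x, sigma), 0.0)
```

Shrinking `sigma` makes the smoothed function steeper and closer to the original step, which is one way to read the later note that the method struggles when the model noise is made too small.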
“The moral is that optimal control of discontinuous dynamics cannot give acceptable results with a deterministic model”
1. Results from HOO/HOLOP show that you can get extremely good results with discontinuous models, as long as the value function is smooth near the maximum
2. They may be discussing continuous time though, which is a different setting
“The modeling and simulation of multibody systems with contacts and friction is a broad and active field of research. One reasonable approach, not investigated here, is to model discontinuous phenomena with a continuous-time hybrid dynamical system which undergoes switching. These methods require accurate resolution of collision and separation times, so fixed-time step integration with a fixed computational cost is impossible.”
This paper also performs local dynamic programming
Mentions Tedrake’s LQR trees as a way of measuring the size of basins of attraction for these local methods
Criticizes another paper that develops a solution to deterministic race car driving, saying that the approach is not robust because it comes too close to failure modes such as the limit of adhesion. I disagree with that: if the problem is deterministic, you should take advantage of that.
1. Limits of adhesion are used in that paper because it analyzes domains with discontinuities, and limits of adhesion are another example of that
2. Unfortunately, the web page that describes the problem is gone so I can’t test against it
In this paper they reuse rollouts by cutting off the beginning of the previous result and then repeating the last control at the end to restore the original length
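The shifting described above can be sketched as follows. This is an illustration of the idea as I read it, not the authors' code; the function name `shift_controls` and the `steps` parameter are hypothetical.

```python
def shift_controls(prev_controls, steps=1):
    """Warm-start the next plan from the previous one.

    Drops the first `steps` controls (already executed) and pads the end by
    repeating the last control, so the sequence keeps its original length.
    """
    prev_controls = list(prev_controls)
    if not prev_controls:
        raise ValueError("prev_controls must not be empty")
    if not 0 <= steps <= len(prev_controls):
        raise ValueError("steps must be between 0 and len(prev_controls)")
    return prev_controls[steps:] + [prev_controls[-1]] * steps

# Previous plan over a 4-step horizon; after executing one step, reuse the rest.
previous_plan = [0.1, 0.4, -0.2, 0.3]
warm_start = shift_controls(previous_plan)
```

The shifted sequence is only an initial guess; the trajectory optimizer would still refine it at each step.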
Also use trajectory libraries here around the limit cycle
The domain tested models a finger flicking a spinner – they come up with a limit cycle for the problem
40s of planning per step (after library and shifting of previous steps) on a 4-core i7
The controller was robust to human manipulation of the domain online
When the noise in the model was made too small, the method sometimes failed to produce a productive limit cycle, or the trajectory optimizer did not converge
They say one thing they would like to do is to compute solutions over the full distributions of the system as opposed to sampled points, but that introduces other problems
1. The distributions become multimodal on impact, so Gaussian approximations don’t work any more
Also mentions problems with the method, such as inadequate exploration: the initial controller is passive (no force), so the only reason it made contact at all was that gravity brought the finger into contact with the spinner
Ellipses allow the dynamics to be differentiable (with added noise), but some other shapes would prevent differentiability altogether
The equations used to control the policy mix quantities of different units, so that choosing different units for either distances or impulses would lead to different policies, which is not desirable
1. “in many ways it is easier to write down a numerical method for rigid-body dynamics than it is to say exactly what the method is trying to compute”
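A toy illustration of the unit-dependence concern, not the paper's actual equations. It assumes a made-up scalar problem: a decision `u` with distance residual `1 - u` and impulse residual `u`, combined by a plain sum of squares. The name `best_u` and the scale factor `distance_scale` (which stands in for a change of distance unit) are hypothetical.

```python
def best_u(distance_scale):
    """Minimizer of (distance_scale * (1 - u))**2 + u**2 over u.

    Setting the derivative to zero gives u = s^2 / (s^2 + 1), where s is
    distance_scale, the factor converting the distance to the chosen unit.
    """
    s2 = distance_scale ** 2
    return s2 / (s2 + 1.0)

# Same physical problem, distance expressed in meters versus centimeters.
u_meters = best_u(1.0)         # distance residual measured in meters
u_centimeters = best_u(100.0)  # same residual measured in centimeters
```

Because the two residuals are added without a unit-consistent weighting, the minimizer moves from 0.5 to nearly 1 when only the distance unit changes, even though nothing physical changed.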

Ari Weinstein's Research
