- Aside from floor contacts, joint limits also introduce discontinuities
- As in Tom Erez’s work, they introduce stochasticity to smooth the problem so that methods requiring smoothness can be used
- “The moral is that optimal control of discontinuous dynamics cannot give acceptable results with a deterministic model”
- Results from HOO/HOLOP show that you can get extremely good results with discontinuous models, as long as the value function is smooth near the maximum
- He may be discussing continuous time though, which is a different setting
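
To make the smoothing idea above concrete, here is a minimal sketch of how added noise turns a discontinuous term into a smooth one in expectation. It is a toy illustration, not the paper's model: the step-shaped `contact_force`, the noise level `sigma`, and all function names are assumptions made up for this example.

```python
import math
import random


def contact_force(x):
    """Hypothetical discontinuous term: unit force at or below the floor (x <= 0), else zero."""
    return 1.0 if x <= 0.0 else 0.0


def smoothed_contact_force(x, sigma):
    """Expected force when the height x is perturbed by Gaussian noise N(0, sigma^2).

    E[f(x + e)] = P(x + e <= 0) = Phi(-x / sigma), which is smooth in x.
    """
    return 0.5 * (1.0 + math.erf(-x / (sigma * math.sqrt(2.0))))


def monte_carlo_force(x, sigma, n=20000, seed=0):
    """Sample-based estimate of the same expectation, using a fixed seed for repeatability."""
    rng = random.Random(seed)
    total = sum(contact_force(x + rng.gauss(0.0, sigma)) for _ in range(n))
    return total / n


if __name__ == "__main__":
    for x in (-0.2, -0.05, 0.0, 0.05, 0.2):
        print(x, contact_force(x), round(smoothed_contact_force(x, 0.1), 4))
```

The deterministic force jumps at `x = 0`, while the expectation under noise changes gradually, so it has a usable derivative everywhere. As `sigma` shrinks, the smoothed curve approaches the step again.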

- “The modeling and simulation of multibody systems with contacts and friction is a broad and active field of research. One reasonable approach, not investigated here, is to model discontinuous phenomena with a continuous-time *hybrid* dynamical system which undergoes switching. These methods require accurate resolution of collision and separation times, so fixed-time step integration with a fixed computational cost is impossible.”
- This paper also performs local dynamic programming
- Mentions Tedrake’s LQR trees as a way of measuring the size of basins of attraction for these local methods
- Criticizes another paper that develops a solution to deterministic race car driving, saying it is not good that the approach comes too close to failure modes such as the limit of adhesion, because it isn’t robust. I disagree with that – if the problem is deterministic, you should take advantage of that.
- Limits of adhesion are used in that paper because it analyzes domains with discontinuities, and limits of adhesion are another example of that
- Unfortunately, the web page that describes the problem is gone so I can’t test against it

- In this paper they reuse rollouts by cutting off the beginning of the previous result and then appending a copy of the last control to the end
- Also use trajectory libraries here around the limit cycle
- The domain tested models a finger flicking a spinner – they come up with a limit cycle for the problem
- 40s of planning per step (after library and shifting of previous steps) on a 4-core i7
- The controller was robust to human manipulation of the domain online
- When the noise in the model was made too small, the method sometimes failed to produce a productive limit cycle, or the trajectory optimizer did not converge
- They say one thing they would like to do is to compute solutions over the full distributions of the system as opposed to sampled points, but that introduces other problems
- The distributions become multimodal on impact, so Gaussian approximations don’t work any more
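
The rollout reuse described in this list can be sketched as a simple shift of the previous control sequence. This is only an illustration of the idea as summarized in these notes; the function name `shift_controls` and the list-of-floats representation are assumptions, not the paper's code.

```python
def shift_controls(prev_controls):
    """Warm-start the next plan from the previous one.

    Drops the first control (already executed) and repeats the last control
    so the planning horizon keeps the same length.
    """
    if not prev_controls:
        raise ValueError("need at least one control to shift")
    return prev_controls[1:] + [prev_controls[-1]]


if __name__ == "__main__":
    previous_plan = [0.1, 0.4, -0.2, 0.0]
    print(shift_controls(previous_plan))
```

The shifted sequence would then serve as the initial guess for the trajectory optimizer at the next step, rather than starting from scratch.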

- Also mentions problems with the method, such as inadequate exploration: the initial controller is passive (no force), so the only reason it made contact at all was that gravity brought the finger into contact with the spinner
- Ellipses allow the dynamics to be differentiable (with added noise), but some other shapes would prevent differentiability altogether
- The equations used to control the policy mix quantities of different units in a manner such that choosing different units for either distances or impulses would lead to different policies, which is not desirable
- “in many ways it is easier to write down a numerical method for rigid-body dynamics than it is to say exactly what the method is trying to compute”
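
The unit-dependence problem noted above can be seen in a toy example. This is not the paper's formulation: the scalar control `u`, the cost that sums a squared distance and a squared impulse, and the name `best_control` are all hypothetical, chosen only to show how rescaling one quantity moves the minimizer.

```python
def best_control(distance_scale, impulse_scale):
    """Minimizer over u of (distance_scale * (1 - u))**2 + (impulse_scale * u)**2.

    Hypothetical setup: (1 - u) is the remaining distance and u is the impulse applied.
    Setting the derivative to zero gives u = a / (a + b), with a and b the squared scales.
    """
    a = distance_scale ** 2
    b = impulse_scale ** 2
    return a / (a + b)


if __name__ == "__main__":
    # Same physical problem, with distance measured in two different units.
    print(best_control(1.0, 1.0))    # distance in meters
    print(best_control(100.0, 1.0))  # distance in centimeters
```

Because the two terms are added without a conversion factor, measuring distance in centimeters instead of meters weights it 10,000 times more heavily, and the chosen control changes even though the physical problem is the same.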