- Algorithm develops options by discovering subgoals based on commonalities across multiple paths to a solution
- “The bottlenecks of interest are those that appear early and persist throughout learning”
- Trajectories are labeled as either positive (successful) or negative (unsuccessful)
- For non-episodic tasks, the method can still be used as long as there is some way of segmenting the history into finite-length trajectories
- Mentions a master’s thesis (by one of the authors) that shows bad options slow learning
- Because of this, when creating options you want to make sure that what you add is actually helpful
- They have a particular means of doing their classification, but I don’t think it’s so important at the moment
- They have to do some tricks, like omitting states near the start state and the goal, since otherwise those states look important
- I suppose in the transfer setting, where start and goal states change, this isn’t an issue
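
To make the notes above concrete for myself, here is a minimal sketch of the general idea. It is not the paper's classification method: it swaps in a simple frequency score (fraction of positive trajectories that visit a state minus the fraction of negative ones). The function names `segment` and `bottleneck_scores`, the `exclude` parameter, and the toy two-room trajectories are all my own assumptions for illustration.

```python
from collections import Counter


def segment(history, length):
    """Split one long state history into consecutive trajectories of at most `length` states."""
    return [history[i:i + length] for i in range(0, len(history), length)]


def bottleneck_scores(positive, negative, exclude=frozenset()):
    """Score states by how much more often they occur in positive than negative trajectories.

    This is a simplified stand-in, not the paper's classifier. Each state is
    counted at most once per trajectory. States in `exclude` (for example the
    start and goal) are dropped from the result.
    """
    pos_counts = Counter(s for traj in positive for s in set(traj))
    neg_counts = Counter(s for traj in negative for s in set(traj))
    states = (set(pos_counts) | set(neg_counts)) - set(exclude)
    n_pos = max(len(positive), 1)  # avoid division by zero on empty input
    n_neg = max(len(negative), 1)
    return {s: pos_counts[s] / n_pos - neg_counts[s] / n_neg for s in states}


# Toy two-room example (hypothetical): "D" is the doorway between the rooms.
positive = [["s", "a", "D", "x", "g"], ["s", "b", "D", "y", "g"]]
negative = [["s", "a", "b", "a"]]

scores = bottleneck_scores(positive, negative, exclude={"s", "g"})
best = max(scores, key=scores.get)  # candidate subgoal

# Without the exclusion, the goal scores as high as the doorway.
unfiltered = bottleneck_scores(positive, negative)
```

In this toy case the doorway is the top-scoring state once the start and goal are excluded, while without the exclusion the goal ties with it, which is the problem the omission trick is meant to avoid. For a non-episodic task, something like `segment(history, length)` would produce the trajectories first, though how to label the resulting segments as positive or negative is left open here.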