Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density. McGovern, Barto. ICML 2001

  1. Algorithm develops options by discovering subgoals based on commonalities across multiple paths to a solution
  2. “The bottlenecks of interest are those that appear early and persist throughout learning”
  3. Trajectories are labeled as either positive (successful) or negative (unsuccessful)
  4. For non-episodic tasks, the method can still be used given some way of segmenting the history into finite-length trajectories
  5. Mentions a master's thesis (by one of the authors) showing that bad options slow learning
    1. Because of this, when creating options you want to make sure what you add is actually helpful
  6. They have a particular means of doing their classification, but I don’t think it’s so important at the moment
  7. They have to do some tricks, like omitting states near the start state and the goal, since otherwise those states look important
    1. I suppose in the transfer setting where start and goal states change this isn’t an issue
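
To make the idea in the notes above concrete, here is a minimal sketch of scoring candidate subgoals by diverse density over positive and negative trajectories. It assumes the standard noisy-or formulation from multiple-instance learning, not necessarily the exact classifier used in the paper. The grid coordinates, the Gaussian similarity, the `scale` parameter, the `exclude_radius` filter, and all function names are illustrative assumptions.

```python
import math

def diverse_density(target, positive_bags, negative_bags, scale=1.0):
    """Noisy-or diverse density of `target`, an (x, y) state (assumed form)."""
    def p_instance(state):
        # Assumed similarity: Gaussian in squared grid distance to the target.
        d2 = (state[0] - target[0]) ** 2 + (state[1] - target[1]) ** 2
        return math.exp(-scale * d2)

    def p_none(bag):
        # Probability that no state in the trajectory matches the target.
        prod = 1.0
        for state in bag:
            prod *= 1.0 - p_instance(state)
        return prod

    dd = 1.0
    for bag in positive_bags:   # successful trajectories should hit the target
        dd *= 1.0 - p_none(bag)
    for bag in negative_bags:   # unsuccessful trajectories should avoid it
        dd *= p_none(bag)
    return dd

def find_subgoal(positive_bags, negative_bags, start, goal, exclude_radius=1):
    """Return the highest-scoring state, skipping states near start and goal."""
    def near(a, b):
        return max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= exclude_radius

    candidates = {
        s for bag in positive_bags for s in bag
        if not near(s, start) and not near(s, goal)
    }
    return max(
        sorted(candidates),  # sorted for a deterministic tie-break
        key=lambda s: diverse_density(s, positive_bags, negative_bags),
    )

# Hypothetical two-room grid: every successful path crosses the door at (3, 1).
start, goal = (0, 1), (6, 1)
positive = [
    [(0, 1), (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1)],
    [(0, 1), (1, 2), (2, 2), (3, 1), (4, 0), (5, 0), (6, 1)],
    [(0, 1), (1, 0), (2, 0), (3, 1), (4, 2), (5, 2), (6, 1)],
]
negative = [
    [(0, 1), (1, 1), (1, 2), (0, 2)],
    [(0, 1), (1, 0), (0, 0), (1, 1)],
]
subgoal = find_subgoal(positive, negative, start, goal)
```

In this toy data the door is the only state shared by all successful paths, so it scores highest. Without the `exclude_radius` filter, the start and goal states would also appear in every positive trajectory, which is the problem item 7 describes.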
