Learning Task-Specific State Representations by Maximizing Slowness and Predictability. Jonschkowski, Brock. International Workshop on Evolutionary and Reinforcement Learning for Autonomous Robot Systems (ERLARS) 2013

Referenced from here: https://aresearch.wordpress.com/2015/04/17/autonomous-learning-of-state-representations-for-control-an-emerging-field-aims-to-autonomously-learn-state-representations-for-reinforcement-learning-agents-from-their-real-world-sensor-observations/

  1. As the title suggests, the paper discusses setting up a neural network that optimizes for predictability and slowness
  2. In addition to the usual criteria for a learned representation (it allows representation of the value function, allows the original representation to be reproduced, and allows for the Markov property/prediction), they add a couple of other criteria:
    1. Slowness
    2. Allows for transfer
  3. Consider fully observable tasks
  4. Also requires that representation is diverse (this exists in SFA to prevent trivial constant output)
  5. Formally, the cost function has 3 terms added together:
    1. Slowness (representation changes only a small amount between two subsequent steps)
    2. Diversity (states farther in time should be mapped farther apart)
    3. Transition function (must allow for accurate prediction of next state, given state, action)
  6. The representation learned according to these criteria is then used to learn the Q function
  7. The test domain is a racetrack; the representation is a graphical 10×10 overhead greyscale image <doesn't seem so hard, but this is just a workshop paper>
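
To make the three-term cost in item 5 concrete, here is a rough NumPy sketch. It is not the paper's formulation: the linear mapping `W`, the exponential diversity penalty, the `min_gap` threshold, the least-squares transition model, and the weights are all assumptions chosen only to illustrate how slowness, diversity, and predictability could be added into a single cost.

```python
import numpy as np

def slowness_loss(states):
    # Mean squared change of the state between two subsequent steps.
    diffs = states[1:] - states[:-1]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

def diversity_loss(states, min_gap=5):
    # Assumed stand-in: penalize pairs that are at least `min_gap` steps
    # apart in time but still mapped close together.
    n = len(states)
    i, j = np.triu_indices(n, k=min_gap)  # index pairs with j - i >= min_gap
    dists = np.linalg.norm(states[i] - states[j], axis=1)
    return float(np.mean(np.exp(-dists)))

def prediction_loss(states, actions, n_actions):
    # Assumed stand-in: fit a linear transition model
    # s_{t+1} ~ [s_t, one_hot(a_t)] @ M by least squares and
    # report the mean squared residual.
    one_hot = np.eye(n_actions)[actions[:-1]]
    inputs = np.hstack([states[:-1], one_hot])
    targets = states[1:]
    model, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
    residual = targets - inputs @ model
    return float(np.mean(np.sum(residual ** 2, axis=1)))

def total_loss(W, observations, actions, n_actions, weights=(1.0, 1.0, 1.0)):
    # Hypothetical linear observation-to-state mapping; the three terms
    # are simply added together, as in the notes above.
    states = observations @ W.T
    w_slow, w_div, w_pred = weights
    return (w_slow * slowness_loss(states)
            + w_div * diversity_loss(states)
            + w_pred * prediction_loss(states, actions, n_actions))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    observations = rng.normal(size=(50, 100))  # 50 steps of flattened 10x10 images
    actions = rng.integers(0, 4, size=50)      # 4 discrete actions (assumed)
    W = 0.1 * rng.normal(size=(2, 100))        # map each image to a 2-D state
    print(total_loss(W, observations, actions, n_actions=4))
```

In a real setup `W` (or the network weights) would be optimized against this cost, and the resulting states would then be fed to the Q-function learner mentioned in item 6.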
