Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable MDPs. John Loch, Satinder Singh.


  1. Uses eligibility traces (Sarsa(λ)) to learn memoryless policies directly in POMDPs; previous methods mainly used memory to estimate the underlying state, which is computationally expensive
  2. Results are relevant to: “POMDPs that have good memoryless policies, i.e., on problems in which there may well be very poor observability but there also exists a mapping from the agent’s immediate observations to actions that yield near-optimal return.”
  3. The paper is almost entirely empirical results
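The core idea can be sketched as Sarsa(λ) run directly over observation-action pairs rather than state-action pairs. The toy environment and hyperparameters below are my own illustration, not from the paper: a 4-state corridor in which two states emit the same observation, so the agent must find a memoryless mapping from observations to actions.

```python
import random

N_STATES = 4          # states 0..3; state 3 is the goal/terminal state
OBS = [0, 1, 1, 2]    # states 1 and 2 are aliased to the same observation
ACTIONS = [0, 1]      # 0 = left, 1 = right

def step(state, action):
    """Deterministic corridor dynamics; reward 1 on reaching the goal."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

def sarsa_lambda(episodes=500, alpha=0.1, gamma=0.95, lam=0.9,
                 epsilon=0.1, seed=0):
    rng = random.Random(seed)
    n_obs = max(OBS) + 1
    q = [[0.0, 0.0] for _ in range(n_obs)]  # Q over observations, not states

    def choose(obs):
        if rng.random() < epsilon:
            return rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: q[obs][a])

    for _ in range(episodes):
        e = [[0.0, 0.0] for _ in range(n_obs)]   # eligibility traces
        s = 0
        o, a = OBS[s], choose(OBS[s])
        for _ in range(100):                      # cap episode length
            s2, r, done = step(s, a)
            o2, a2 = OBS[s2], choose(OBS[s2])
            delta = (r if done else r + gamma * q[o2][a2]) - q[o][a]
            e[o][a] = 1.0                         # replacing trace
            for ob in range(n_obs):
                for ac in ACTIONS:
                    q[ob][ac] += alpha * delta * e[ob][ac]
                    e[ob][ac] *= gamma * lam
            if done:
                break
            s, o, a = s2, o2, a2
    return q

q = sarsa_lambda()
policy = [max(ACTIONS, key=lambda a: q[o][a]) for o in range(max(OBS) + 1)]
```

Because the traces propagate the goal reward back through the aliased observation, the learned greedy policy maps every observation to "right", which is the best memoryless policy for this corridor.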