Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable MDPs. John Loch, Satinder Singh.

  1. Uses eligibility traces to learn memoryless policies (direct mappings from immediate observations to actions) in POMDPs; previous methods mainly used memory to estimate the underlying state, which gets computationally expensive
  2. Results are relevant to: “POMDPs that have good memoryless policies, i.e., on problems in which there may well be very poor observability but there also exists a mapping from the agent’s immediate observations to actions that yield near-optimal return.”
  3. The paper is almost entirely empirical results
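
The method boils down to running Sarsa(λ) with eligibility traces over (observation, action) pairs instead of (state, action) pairs, so the learned policy conditions only on the immediate observation. Below is a minimal sketch of that idea; the `Corridor` toy environment and its `reset()`/`step()` interface are my own illustrative assumptions, not from the paper (and the toy is fully observable for simplicity, whereas the paper's point is that this works even when observations alias many states):

```python
import random
from collections import defaultdict

def sarsa_lambda(env, n_obs, n_actions, episodes=300,
                 alpha=0.1, gamma=0.95, lam=0.9, epsilon=0.1):
    """Sarsa(lambda) with replacing traces over (observation, action)
    pairs, yielding a memoryless policy: observation -> action."""
    Q = defaultdict(float)

    def eps_greedy(obs):
        if random.random() < epsilon:
            return random.randrange(n_actions)
        qs = [Q[(obs, a)] for a in range(n_actions)]
        best = max(qs)
        return random.choice([a for a in range(n_actions) if qs[a] == best])

    for _ in range(episodes):
        traces = defaultdict(float)
        obs = env.reset()
        act = eps_greedy(obs)
        done = False
        while not done:
            obs2, reward, done = env.step(act)
            act2 = eps_greedy(obs2)
            # TD error; no bootstrap past a terminal observation
            delta = reward + (0.0 if done else gamma * Q[(obs2, act2)]) - Q[(obs, act)]
            traces[(obs, act)] = 1.0        # replacing trace
            for key in list(traces):
                Q[key] += alpha * delta * traces[key]
                traces[key] *= gamma * lam  # decay all traces
            obs, act = obs2, act2

    # the memoryless policy: act greedily on the immediate observation
    return {o: max(range(n_actions), key=lambda a: Q[(o, a)])
            for o in range(n_obs)}

class Corridor:
    """Toy 4-state corridor (hypothetical): start at 0, +1 reward at
    state 3; action 0 moves left, action 1 moves right."""
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        self.s = min(3, self.s + 1) if a == 1 else max(0, self.s - 1)
        done = (self.s == 3)
        return self.s, (1.0 if done else 0.0), done
```

Usage: `policy = sarsa_lambda(Corridor(), n_obs=4, n_actions=2)` should converge to "move right" everywhere; in a genuinely aliased POMDP the same loop searches the space of observation-to-action mappings, which is exactly the memoryless-policy class the paper evaluates.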
