Deals with developing policies in very high dimensional state spaces

Proposes a linear dimensionality reduction (projection?) algorithm that discovers predictive projections

A predictive projection is a prediction of future states by nearest-neighbor learning

Consider robotics where state information is camera input – this raw state is too high D to work in natively

Want something that can reduce the dimension planning has to take place it

In the projected space, the idea is that the same action in two similar projected states should produce a similar outcome

Work here is based on Gradient-based distance metric learning

More specifically, based on something called Neighborhood Components Analysis (NCA) which minimizes error for nearest-neighbor classification

Projection is made to maintain accuracy on estimates of future states.

Problem is that because it is a least-squares (I think) metric, noise and outliers cause problems, so another trick has to be used

“The predictive projections algorithm as described above may not perform well in cases where the effects of different actions are restricted to specific state dimensions.”

They use LSPI to do learn the policy, test on Lagoudakis’ pendulum

Other stuff, skipping. I think I found the wrong paper…