Ideas from Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods, Alessandro Lazaric, Andrea Bonarini, Marcello Restelli

The paper addresses reinforcement learning in continuous action spaces via an actor-critic architecture: the actor holds the policy, while the critic holds a value function independent of the policy

The goal is not to learn the optimal value function perfectly everywhere, but to find the optimal policy

The actor takes an action and is then criticized by the critic. Based on this feedback, the actor modifies its policy via a stochastic gradient method on the policy space

The proposed method uses sequential Monte Carlo (SMC) methods to approximate the sequence of probability distributions implemented by the actor, which they call SMC-Learning

Actions are initially selected at random, then resampled via importance sampling, with importance weights derived from the values computed by the critic

Because of Monte Carlo sampling and Boltzmann exploration, an accurate model can be built in the limit
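
Boltzmann exploration turns the critic's value estimates into sampling probabilities via a softmax. A minimal sketch (function and parameter names are my own, not the paper's):

```python
import numpy as np

def boltzmann_probs(q_values, temperature=1.0):
    """Softmax over value estimates; higher temperature -> closer to uniform,
    lower temperature -> greedier selection."""
    z = np.asarray(q_values, dtype=float) / temperature
    z -= z.max()              # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()
```

As the temperature is annealed toward zero, selection concentrates on the highest-valued actions, which is what lets the method sharpen around the optimum over time.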

The computational cost of action selection is logarithmic in the number of samples
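
The logarithmic cost presumably comes from a binary search over cumulative weights (or an equivalent tree structure). One way to realize O(log n) weighted selection, purely illustrative and not the paper's data structure:

```python
import bisect
import random
from itertools import accumulate

def select_action(actions, weights, rng=random):
    """Draw one action with probability proportional to its weight.
    The draw itself is O(log n): binary-search a uniform variate
    against the cumulative weights."""
    cum = list(accumulate(weights))        # O(n) prefix sums
    u = rng.random() * cum[-1]             # uniform in [0, total weight)
    return actions[bisect.bisect_right(cum, u)]
```

In practice the prefix sums would be cached (or kept in a Fenwick/segment tree so weight updates are also logarithmic); rebuilding them per draw, as above, costs O(n).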

There is a set of possible actions for each state; this set is adjusted via sampling so that it eventually contains the optimal action for that state
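
Putting the pieces together, here is a toy sketch of a per-state action set refined by value-weighted importance resampling. All names and constants are hypothetical, and the small Gaussian jitter after resampling is a common SMC move to keep duplicated particles distinct; the paper's actual update and resampling rules differ in detail:

```python
import numpy as np

class StateActionParticles:
    """Per-state set of candidate actions ("particles") with running value
    estimates; resampling concentrates the set on high-value actions."""

    def __init__(self, low, high, n=20, seed=0):
        self.rng = np.random.default_rng(seed)
        self.actions = self.rng.uniform(low, high, n)  # initial random actions
        self.q = np.zeros(n)                           # per-action value estimates

    def update(self, i, reward, alpha=0.1):
        # Running-average update of action i's value estimate (critic feedback)
        self.q[i] += alpha * (reward - self.q[i])

    def resample(self, temperature=0.1, jitter=0.05):
        # Boltzmann weights from the value estimates...
        w = np.exp((self.q - self.q.max()) / temperature)
        w /= w.sum()
        # ...then multinomial resampling plus a small Gaussian jitter so that
        # duplicated particles explore nearby actions
        idx = self.rng.choice(len(self.actions), size=len(self.actions), p=w)
        self.actions = self.actions[idx] + self.rng.normal(0.0, jitter, len(idx))
        self.q = self.q[idx]
```

Repeated rounds of update-then-resample are what drive the action set toward containing the optimal action for its state.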

One thought on “Ideas from Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods, Alessandro Lazaric, Andrea Bonarini, Marcello Restelli”

Again, author(s)? Also, apart from the contents of the paper, how do you see these ideas being useful in your work? What are the limitations that could lead to follow up papers?