- Paper presents a way to optimally (and in closed-form) solve continuous MDPs, does Symbolic Dynamic Programming (SPD)
- Functions in multivariate continuous state and action, discrete noise (whats that mean in this case?), piecewise linear reward.
- There is limited work that deals with exact solutions. In control, there is LQR/LQG (linear quadratic with gaussian noise), but in LQR transition dynamics or reward cannot be piecewise linear
- The paper deals with 2 domains:
**Mars Rover**,**Inventory Control**, and**Resivior Management** - States can be composed of discrete and continuous factors
- (not surprisingly) requires a complete model of the MDP
- Because CSA-MDPS (continuous state action mdps) are naturally factored (there is a reference for this) , they can be dealt with as a dynamic bayes net (DBN)
- Assume no synchronic arcs (all dependencies are strictly forward in time), which allows for a relatively simple description of the transition dynamics

- The solutions here are
**optimal discounted finite-horizon** - The conditional probability functions are described in terms of piecewise linear equations with the following properties:
- Conditioned only on action, current state, and prev state vars (this is weird, why current state?)
- The are deterministic (can be encoded with dirac deltas), this means gaussian noise in the dynamics, for example is no good.
- They are piecewise linear

- Can sort of handle quadratic rewards, but this is restricted a bit (has to be univariate quadratic?)
- Describe value iteration for continuous MDPs
- Dynamics get represented as a bunch of disjoint linear and quadratic functions
- <its so cold in this room that I can’t even concentrate on what I’m reading, probably missing stuff here>
- Integration is required, but it turns out that in this setting it is solvable in a particular closed form, so it isn’t an issue
**This is very cool and important, but a general point to make is that this does exact solving, so it is brittle in a way. A region that isn’t linear or quadratic just ends up being unsolvable. Methods that are less exact can deal with that sort of problem****The max operator is commutative? What is the casemax operator?**- It is claimed that the solver is “efficient”, but I’d like to know what that means in particular
- Solutions require linear checking/pruning in the XADD (I think thats the structure that represents the solution) or problems blow up extremely quickly
- Claims here that this approach produces the first exact solution for a couple of canonical OR problems. A very neat paper, I wish I understood it better AAAI 2012 seems to have the very prominent talks on videolectures but thats it.

Advertisements
(function(g,$){if("undefined"!=typeof g.__ATA){
g.__ATA.initAd({collapseEmpty:'after', sectionId:26942, width:300, height:250});
g.__ATA.initAd({collapseEmpty:'after', sectionId:114160, width:300, height:250});
}})(window,jQuery);
var o = document.getElementById('crt-823952685');
if ("undefined"!=typeof Criteo) {
var p = o.parentNode;
p.style.setProperty('display', 'inline-block', 'important');
o.style.setProperty('display', 'block', 'important');
Criteo.DisplayAcceptableAdIfAdblocked({zoneid:388248,containerid:"crt-823952685",collapseContainerIfNotAdblocked:true,"callifnotadblocked": function () {var o = document.getElementById('crt-823952685'); o.style.setProperty('display','none','important');o.style.setProperty('visbility','hidden','important'); } });
} else {
o.style.setProperty('display', 'none', 'important');
o.style.setProperty('visibility', 'hidden', 'important');
}
var o = document.getElementById('crt-49859679');
if ("undefined"!=typeof Criteo) {
var p = o.parentNode;
p.style.setProperty('display', 'inline-block', 'important');
o.style.setProperty('display', 'block', 'important');
Criteo.DisplayAcceptableAdIfAdblocked({zoneid:837497,containerid:"crt-49859679",collapseContainerIfNotAdblocked:true,"callifnotadblocked": function () {var o = document.getElementById('crt-49859679'); o.style.setProperty('display','none','important');o.style.setProperty('visbility','hidden','important'); } });
} else {
o.style.setProperty('display', 'none', 'important');
o.style.setProperty('visibility', 'hidden', 'important');
}