- Considers case of simple regret, where exploration and exploitation are separated
- In the setting, the agent is allowed
*n*pulls of arms and at the end must recommend an arm. The agent is not necessarily aware of the value of*n*. - Costs during exploration are not in terms of regret, but some other resource, such as computation time
- May also require anytime performance
- The smaller the cumulative regret, the larger the simple regret
- Related to exploration/exploitation tradeoff
- Simple regret can be upper bounded in terms of cumulative regret
- For finite-armed bandits, best simple regret is achieved by sampling each arm a linear number of times, whereas in cumulative regret suboptimal arms should not be sampled more than a logarithmic number of times

- The statement that each arm should get linear number of pulls for best simple regret is true only in case of large
*n*. For smaller values other techniques will yield better results (UCB type is better) - Rate of decrease of simple regret is 1/sqrt(n) in distribution free case, in distribution dependent case the decrease is exponential. I’m not sure what the difference is here though.
- In continuous spaces the regret bound seems to be O(
*n*) which i dont understand

Advertisements
(function(g,$){if("undefined"!=typeof g.__ATA){
g.__ATA.initAd({collapseEmpty:'after', sectionId:26942, width:300, height:250});
g.__ATA.initAd({collapseEmpty:'after', sectionId:114160, width:300, height:250});
}})(window,jQuery);
var o = document.getElementById('crt-1598769752');
if ("undefined"!=typeof Criteo) {
var p = o.parentNode;
p.style.setProperty('display', 'inline-block', 'important');
o.style.setProperty('display', 'block', 'important');
Criteo.DisplayAcceptableAdIfAdblocked({zoneid:388248,containerid:"crt-1598769752",collapseContainerIfNotAdblocked:true,"callifnotadblocked": function () {var o = document.getElementById('crt-1598769752'); o.style.setProperty('display','none','important');o.style.setProperty('visbility','hidden','important'); } });
} else {
o.style.setProperty('display', 'none', 'important');
o.style.setProperty('visibility', 'hidden', 'important');
}
var o = document.getElementById('crt-1916574585');
if ("undefined"!=typeof Criteo) {
var p = o.parentNode;
p.style.setProperty('display', 'inline-block', 'important');
o.style.setProperty('display', 'block', 'important');
Criteo.DisplayAcceptableAdIfAdblocked({zoneid:837497,containerid:"crt-1916574585",collapseContainerIfNotAdblocked:true,"callifnotadblocked": function () {var o = document.getElementById('crt-1916574585'); o.style.setProperty('display','none','important');o.style.setProperty('visbility','hidden','important'); } });
} else {
o.style.setProperty('display', 'none', 'important');
o.style.setProperty('visibility', 'hidden', 'important');
}