Lipshitz Bandits without the Lipshitz Constant. Bubeck, Stoltz, Yu. Algorithmic Learning Theory 2011.

  1. Generally, algorithms assume the Lipshitz constant is known.  This paper does not assume this.
  2. Shows it is possible to develop an algorithm that has regret that is minimax optimal in the globally Lipshitz setting without knowing the Lipshitz constants
  3. Functions in noisy domains
  4. Mention DIRECT, which doesn’t assume a Lipshitz constant, which is said to work well in practice, but only asymptotic convergence is guaranteed
  5. Says the discretization method used is similar to Kleinberg’s.  Then has 2 phases:
    1. Starts with uniform exploration to make some estimate of the Lipshitz constant (which is good enough to get the bounds)
    2. Find the optimal interval using a “standard” exploration-exploitation strategy
  6. Mentions this 2-step method may be useful for domains aside from bandits
  7. Basically assumptions are that f(x) and its derivative are Lipshitz in the hypercube

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: