Optimal Learning for Sequential Sampling with Non-parametric Beliefs. Barut, Powell. Journal of Global Optimization?


[this is based on a quick 2nd read-through, so fewer notes than usual]

  1. Aggregates over a set of kernel functions in order to do ranking and selection more effectively 
  2. The weighting scheme used is shown to be optimal under independent kernel estimators (need to find out what that means exactly; see the kernel-aggregation sketch after this list)
  3. Uses the knowledge gradient to select sampling points: roughly, given the current belief over the alternatives, it computes the expected improvement in the best estimate from taking one more sample at each candidate point, and samples where that value is highest (a separate optimization routine is needed to find those locations; see the knowledge-gradient sketch after this list)
  4. Proposed policy is shown to be asymptotically optimal
  5. This paper is concerned with searching over a finite number of options (computationally, restricted to a few thousand)
  6. Similarity is measured by a kernel.  Although there is no explicit use of Lipschitz smoothness or concavity, the kernels must satisfy a Hölder condition (see the note after this list)
  7. In the offline setting, the problem addressed is called ranking and selection; in the online setting it's called the multi-armed bandit problem
  8. References for earliest papers on stochastic search
  9. Noise is assumed to be normally distributed
  10. The policy is one-step (myopically) optimal, and converges to the optimal solution in the limit
  11. In their empirical results they use zero-mean 1D Gaussian processes.  They test their method against random sampling and another GP-based method, SKO, which attempts to find the length-scale online; theirs instead just weights a set of provided kernels, effectively finding the best one
    1. For the differing kernels, they used linear fits but with different bandwidths
    2. Their method does very well, although SKO works better for larger length-scales.  For GPs that actually have very short length-scales, KGCB is better.
  12. Also, SKO is less general, and only works well on certain classes of functions (certain covariance functions cause trouble, for example).  Seems to be a little more fragile.
  13. I respect them for pointing out situations where SKO outperforms KGCB
  14. In an energy storage/retrieval problem, their method outperforms SKO, although with larger amounts of noise the bounds begin to overlap.  In this problem KGCB converged more quickly, while in some of the earlier results SKO converged more rapidly, so it depends on the domain.
  15. “Although our policy performs very well in the numerical experiments, there is a caveat.  Kernel estimation is known to suffer from the curse of dimensionality as the MSE is proportional to h^d where h is the bandwidth and d is the number of dimensions.  If observations lie in high dimensional spaces, non-parametric estimation is known to have poor performance.  Because of these reasons, the efficiency of our estimation method also degenerates in higher dimensions.  Additive models might be used to handle this curse but this requires making more assumptions on the structure of the function.”
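
To make the kernel-aggregation idea in points 1 and 2 concrete, here is a minimal sketch (not the paper's actual estimator) that combines several Nadaraya-Watson estimators with different bandwidths through a weight vector.  The bandwidths, weights, and toy data are made up for illustration; the paper derives its weights from the independent-kernel-estimator argument rather than fixing them by hand.

```python
import numpy as np

def nw_estimate(x, X, y, h):
    # Nadaraya-Watson estimate at x using a Gaussian kernel with bandwidth h
    w = np.exp(-0.5 * ((x - X) / h) ** 2)
    return np.sum(w * y) / (np.sum(w) + 1e-12)

def aggregate_estimate(x, X, y, bandwidths, weights):
    # weighted combination of kernel estimators, one estimator per bandwidth
    ests = np.array([nw_estimate(x, X, y, h) for h in bandwidths])
    return np.dot(weights, ests)

# toy usage on noisy 1D observations (hypothetical numbers, not from the paper)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 20)
y = np.sin(2 * np.pi * X) + 0.1 * rng.normal(size=20)
bandwidths = [0.05, 0.1, 0.3]          # the "set of kernels" being aggregated
weights = np.array([0.2, 0.5, 0.3])    # illustrative weights; the paper computes these from data
print(aggregate_estimate(0.5, X, y, bandwidths, weights))
```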
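
For point 3, the knowledge gradient is easiest to state in the classic finite-alternative setting with independent normal beliefs: the KG value of an alternative is the expected increase in the maximum posterior mean from sampling it once more.  The sketch below implements that independent-beliefs formula with made-up numbers; the paper's KGCB policy applies the same idea under correlated, kernel-based beliefs, which this does not reproduce.

```python
import numpy as np
from scipy.stats import norm

def knowledge_gradient(mu, sigma, noise_sd):
    # One-step KG value for each alternative, assuming independent normal
    # beliefs (mean mu, std sigma) and Gaussian observation noise with std noise_sd.
    sigma_tilde = sigma**2 / np.sqrt(sigma**2 + noise_sd**2)  # std of the posterior-mean update
    best_other = np.array([np.max(np.delete(mu, i)) for i in range(len(mu))])
    z = -np.abs(mu - best_other) / np.maximum(sigma_tilde, 1e-12)
    return sigma_tilde * (z * norm.cdf(z) + norm.pdf(z))

# illustrative beliefs over three alternatives (hypothetical numbers)
mu = np.array([0.0, 0.2, 0.15])
sigma = np.array([1.0, 0.5, 0.8])
kg = knowledge_gradient(mu, sigma, noise_sd=0.3)
print("next sample:", np.argmax(kg))   # alternative with the highest KG value
```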
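
For point 6, the Hölder condition referred to is the standard one: a function f is Hölder continuous of order α (with 0 < α ≤ 1) if there is a constant C such that |f(x) − f(y)| ≤ C·|x − y|^α for all x and y; taking α = 1 recovers Lipschitz continuity.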