Optimal Order Simple Regret for Gaussian Process Bandits
Sattar Vakili, Nacime Bouziani, Sepehr Jalali, Alberto Bernacchia, Da-shan Shiu

TL;DR
This paper derives a tighter, order-optimal bound on simple regret for Gaussian Process bandit algorithms, advancing understanding of exploration efficiency in non-convex, expensive optimization tasks.
Contribution
It introduces a new analysis that tightens the bounds on simple regret, including novel confidence intervals for GP models in RKHS.
Findings
Achieves an $\tilde{O}(\sqrt{\frac{\gamma_N}{N}})$ simple regret bound
Proves the bound is order optimal up to logarithmic factors
Develops novel confidence intervals for GP models in RKHS
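For readers less familiar with the notation, the two quantities in the bound are usually defined as follows. These are the standard definitions from the GP bandit literature (written for a maximization problem), not text taken from this page; the symbols $x^*$, $\hat{x}_N$, $K_N$ and $\lambda$ are notation assumed here for illustration.

$$r_N = f(x^*) - f(\hat{x}_N), \qquad x^* \in \arg\max_{x} f(x)$$

$$\gamma_N = \max_{x_1, \dots, x_N} \tfrac{1}{2} \log\det\left(I_N + \lambda^{-2} K_N\right)$$

Here $\hat{x}_N$ is the point the algorithm reports after $N$ exploration trials, $K_N$ is the kernel matrix of the queried points, and $\lambda^2$ is the assumed observation noise variance. Because $\gamma_N$ typically grows sublinearly in $N$, a bound of order $\sqrt{\gamma_N / N}$ shrinks to zero as $N$ grows.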
Abstract
Consider the sequential optimization of a continuous, possibly non-convex, and expensive to evaluate objective function $f$. The problem can be cast as a Gaussian Process (GP) bandit where $f$ lives in a reproducing kernel Hilbert space (RKHS). The state-of-the-art analysis of several learning algorithms shows a significant gap between the lower and upper bounds on the simple regret performance. When $N$ is the number of exploration trials and $\gamma_N$ is the maximal information gain, we prove an $\tilde{O}(\sqrt{\gamma_N/N})$ bound on the simple regret performance of a pure exploration algorithm that is significantly tighter than the existing bounds. We show that this bound is order optimal up to logarithmic factors for the cases where a lower bound on regret is known. To establish these results, we prove novel and sharp confidence intervals for GP models applicable to RKHS…
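To make the idea of a pure exploration algorithm concrete, here is a minimal sketch of one common rule: query the candidate point with the largest GP posterior variance at every step, then report the maximizer of the posterior mean. Whether this matches the paper's algorithm in every detail is an assumption, since the abstract does not spell it out. The function names, the squared-exponential kernel, the finite candidate grid, and all default parameters are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    # Squared-exponential kernel between the rows of A and B (assumed kernel).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X_obs, y_obs, X_cand, noise_var, lengthscale=0.2):
    # Posterior mean and variance of a zero-mean, unit-variance GP at X_cand.
    K = rbf_kernel(X_obs, X_obs, lengthscale) + noise_var * np.eye(len(X_obs))
    k_star = rbf_kernel(X_cand, X_obs, lengthscale)
    mean = k_star @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.einsum("ij,ji->i", k_star, np.linalg.solve(K, k_star.T))
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def max_variance_exploration(f, X_cand, n_trials, noise_var=1e-4, seed=0):
    # Pure exploration: the query rule never looks at the observed values,
    # only at the posterior variance.
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(X_cand)))]  # arbitrary first query
    y = [f(X_cand[idx[0]]) + np.sqrt(noise_var) * rng.normal()]
    for _ in range(n_trials - 1):
        _, var = gp_posterior(X_cand[idx], np.array(y), X_cand, noise_var)
        nxt = int(np.argmax(var))  # most uncertain candidate
        idx.append(nxt)
        y.append(f(X_cand[nxt]) + np.sqrt(noise_var) * rng.normal())
    # After N trials, report the maximizer of the posterior mean.
    mean, _ = gp_posterior(X_cand[idx], np.array(y), X_cand, noise_var)
    return X_cand[int(np.argmax(mean))]
```

A call might look like `max_variance_exploration(lambda x: -(x[0] - 0.3) ** 2, np.linspace(0, 1, 101)[:, None], n_trials=20)`. The simple regret of a run is the gap between the true maximum of `f` and the value of `f` at the returned point.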
Taxonomy
Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms
Methods: Gaussian Process
