Gaussian Process Upper Confidence Bound Achieves Nearly-Optimal Regret in Noise-Free Gaussian Process Bandits
Shogo Iwazaki

TL;DR
This paper proves that the GP-UCB algorithm achieves nearly optimal regret bounds in noise-free Gaussian process bandit problems, matching its strong empirical performance with theoretical guarantees.
Contribution
It establishes the first nearly optimal regret bounds for noise-free GP-UCB, including constant cumulative regret for common kernels.
Findings
GP-UCB attains nearly optimal regret bounds in noise-free settings.
First theoretical proof of constant cumulative regret for squared exponential and Matérn kernels.
Bridges the gap between empirical success and theoretical analysis of GP-UCB.
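For reference, "constant cumulative regret" can be read against the standard definition of cumulative regret, which this summary does not spell out. The symbols below (f for the objective, x_t for the query at round t, x^* for a maximizer, T for the horizon) are notation assumed here for illustration, not taken from the paper.

```latex
% Cumulative regret after T rounds (standard definition; notation assumed here)
R_T = \sum_{t=1}^{T} \bigl( f(x^{*}) - f(x_t) \bigr),
\qquad x^{*} \in \operatorname*{arg\,max}_{x} f(x).
% "Constant cumulative regret" then means R_T = O(1) as T \to \infty,
% i.e. the total loss stays bounded no matter how long the learner runs.
```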
Abstract
We study the noise-free Gaussian process (GP) bandit problem, in which the learner seeks to minimize regret using noise-free observations of a black-box objective function that lies in a known reproducing kernel Hilbert space (RKHS). Gaussian process upper confidence bound (GP-UCB) is a well-known GP bandit algorithm that chooses its query points adaptively, based on a GP-based upper confidence bound score. Several existing works report the practical success of GP-UCB, and it tends to perform well empirically compared with other nearly optimal noise-free algorithms that rely on a non-adaptive sampling scheme for query points. The current theoretical results, however, suggest that its performance is suboptimal. This paper resolves this gap between theoretical and empirical performance by showing a nearly optimal regret upper bound for noise-free GP-UCB. Specifically, our…
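The adaptive query rule described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 1-D grid, the squared exponential kernel with lengthscale 0.2, the fixed width parameter `beta`, the small `jitter` term (added only for numerical stability), and the test function are all assumptions made for this example.

```python
import numpy as np

def se_kernel(a, b, lengthscale=0.2):
    """Squared exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, jitter=1e-8):
    """Noise-free GP posterior mean and standard deviation on x_grid.

    jitter is a numerical stabilizer only (an assumption of this sketch),
    not an observation-noise model.
    """
    K = se_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    k_star = se_kernel(x_obs, x_grid)      # shape (n_obs, n_grid)
    sol = np.linalg.solve(K, k_star)       # K^{-1} k_star
    mean = sol.T @ y_obs
    var = 1.0 - np.sum(k_star * sol, axis=0)  # prior variance k(x, x) = 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

def gp_ucb(f, x_grid, n_rounds=15, beta=2.0):
    """Query the grid point that maximizes mean + beta * std each round."""
    x_obs = np.array([x_grid[len(x_grid) // 2]])  # arbitrary first query
    y_obs = np.array([f(x_obs[0])])
    for _ in range(n_rounds - 1):
        mean, std = gp_posterior(x_obs, y_obs, x_grid)
        x_next = x_grid[np.argmax(mean + beta * std)]  # UCB score
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, f(x_next))            # noise-free observation
    return x_obs, y_obs

# Usage: track cumulative regret on a hypothetical smooth objective.
x_grid = np.linspace(0.0, 1.0, 101)
f = lambda x: np.sin(6.0 * x)
xs, ys = gp_ucb(f, x_grid)
cumulative_regret = np.cumsum(f(x_grid).max() - ys)
print(cumulative_regret[-1])
```

In the noise-free RKHS setting, the width parameter is commonly tied to a bound on the RKHS norm of the objective; the fixed value `beta=2.0` here is only a placeholder for that choice.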
Taxonomy
Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning
