On Kernelized Multi-armed Bandits
Sayak Ray Chowdhury, Aditya Gopalan

TL;DR
This paper introduces two Gaussian process-based algorithms for continuous stochastic bandit problems, providing theoretical regret bounds and demonstrating their effectiveness through experiments on synthetic and real-world data.
Contribution
The paper proposes two algorithms for continuous bandits, Improved GP-UCB (IGP-UCB) and GP-Thompson sampling (GP-TS), with corresponding regret bounds, and derives a new self-normalized concentration inequality for vector-valued martingales.
Findings
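In experiments on synthetic and real-world environments, the proposed algorithms compare favorably with existing algorithms.
Regret bounds are established for expected reward functions in the RKHS associated with the Gaussian process kernel supplied to the algorithms.
On the real-world environments, the experiments report gains over existing algorithms.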
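For orientation, the quantity that such regret bounds control is usually the cumulative regret over T rounds. The notation below (arm set D, reward function f, chosen arms x_t, kernel k, norm bound B) is assumed here for illustration and is not taken from this summary:

```latex
% Cumulative regret after T rounds, with x^\star a maximizer of f over the arm set D
R_T = \sum_{t=1}^{T} \bigl( f(x^\star) - f(x_t) \bigr),
\qquad x^\star \in \arg\max_{x \in D} f(x),
% RKHS assumption: f has bounded norm in the RKHS of the kernel k
\qquad \|f\|_k \le B .
```

A bound is informative when R_T grows sublinearly in T, since the average regret R_T / T then tends to zero.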
Abstract
We consider the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown. We provide two new Gaussian process-based algorithms for continuous bandit optimization, Improved GP-UCB (IGP-UCB) and GP-Thompson sampling (GP-TS), and derive corresponding regret bounds. Specifically, the bounds hold when the expected reward function belongs to the reproducing kernel Hilbert space (RKHS) that naturally corresponds to a Gaussian process kernel used as input by the algorithms. Along the way, we derive a new self-normalized concentration inequality for vector-valued martingales of arbitrary, possibly infinite, dimension. Finally, experimental evaluation and comparisons to existing algorithms on synthetic and real-world environments are carried out that highlight the favorable gains of the proposed strategies in many…
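To make the two selection rules concrete, here is a minimal sketch of a generic GP-UCB step and a generic GP Thompson sampling step on a discretized arm set. It is not the paper's IGP-UCB or GP-TS: the squared-exponential kernel, the grid over [0, 1], the regularizer `lam`, and the constant exploration weight `beta` are all assumptions chosen for illustration, whereas the paper's algorithms use specific confidence-width schedules that this summary does not state.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D point arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, lam=1.0):
    """GP posterior mean and covariance on x_grid; lam is an assumed regularizer."""
    if len(x_obs) == 0:
        # No data yet: return the zero-mean prior.
        return np.zeros(len(x_grid)), rbf_kernel(x_grid, x_grid)
    K = rbf_kernel(x_obs, x_obs) + lam * np.eye(len(x_obs))
    k_star = rbf_kernel(x_obs, x_grid)        # shape (n_obs, n_grid)
    solved = np.linalg.solve(K, k_star)       # K^{-1} k_star
    mean = solved.T @ y_obs
    cov = rbf_kernel(x_grid, x_grid) - k_star.T @ solved
    return mean, (cov + cov.T) / 2            # symmetrize against round-off

def select_arm(mean, cov, rule, rng, beta=2.0):
    """Pick a grid index by UCB or by Thompson sampling (beta is a hypothetical constant)."""
    if rule == "ucb":
        std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
        return int(np.argmax(mean + beta * std))
    # Thompson sampling: draw one function from the posterior and maximize it.
    sample = rng.multivariate_normal(mean, cov + 1e-8 * np.eye(len(mean)))
    return int(np.argmax(sample))

def run_bandit(f, rule, horizon=30, noise_std=0.1, seed=0):
    """Play `horizon` rounds on a 50-point grid; return chosen arms and cumulative regret."""
    rng = np.random.default_rng(seed)
    x_grid = np.linspace(0.0, 1.0, 50)
    xs, ys = [], []
    for _ in range(horizon):
        mean, cov = gp_posterior(np.array(xs), np.array(ys), x_grid)
        i = select_arm(mean, cov, rule, rng)
        xs.append(x_grid[i])
        ys.append(f(x_grid[i]) + noise_std * rng.normal())
    regret = np.max(f(x_grid)) - f(np.array(xs))
    return np.array(xs), np.cumsum(regret)
```

Calling `run_bandit(lambda x: np.sin(3 * x), "ucb")` or the same with `"thompson"` returns the sequence of played arms and the running regret against the best grid point, which is a simple way to compare the two rules on a toy function.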
Taxonomy
Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Smart Grid Energy Management
Methods: Gaussian Process
