Finite-Time Analysis of Kernelised Contextual Bandits
Michal Valko, Nathaniel Korda, Remi Munos, Ilias Flaounas, Nello Cristianini

TL;DR
This paper introduces KernelUCB, a kernelised UCB algorithm for contextual bandits with large action sets. It gives a finite-time regret bound and recovers GP-UCB as a special case.
Contribution
The paper proposes KernelUCB, a kernelised UCB algorithm with a finite-time regret analysis that improves on the GP-UCB bound in the agnostic case and matches the lower bound in the linear case.
Findings
KernelUCB achieves better regret bounds than GP-UCB in the agnostic case.
For linear kernels, the regret bound matches the theoretical lower bound.
The analysis applies to large action sets with similarity information.
Abstract
We tackle the problem of online reward maximisation over a large finite set of actions described by their contexts. We focus on the case when the number of actions is too big to sample all of them even once. However, we assume that we have access to the similarities between actions' contexts and that the expected reward is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS). We propose KernelUCB, a kernelised UCB algorithm, and give a cumulative regret bound through a frequentist analysis. For contextual bandits, the related algorithm GP-UCB turns out to be a special case of our algorithm, and our finite-time analysis improves the regret bound of GP-UCB for the agnostic case, both in terms of the kernel-dependent quantity and the RKHS norm of the reward function. Moreover, for the linear kernel, our regret bound matches the…
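To make the idea of a kernelised UCB rule concrete, here is a minimal sketch, not the authors' implementation. It assumes a kernel ridge regression estimate of the reward and an exploration width derived from the regularised Gram matrix. The RBF kernel and the names `rbf_kernel`, `solve`, `kernel_ucb_scores`, `choose_action`, `reg`, `explore`, and `bandwidth` are illustrative choices that do not come from the abstract, and the constants are not the tuned values from the paper's analysis.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) similarity between two context vectors (illustrative choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def kernel_ucb_scores(contexts, history_x, history_y,
                      reg=1.0, explore=1.0, kernel=rbf_kernel):
    """Return one upper-confidence score per candidate context.

    Assumed form: score = mean + explore * width, where
      mean  = k^T (K + reg I)^{-1} y
      width = sqrt((k(x, x) - k^T (K + reg I)^{-1} k) / reg)
    """
    if not history_x:
        # No observations yet: the score is the exploration width alone.
        return [explore * math.sqrt(kernel(x, x) / reg) for x in contexts]
    n = len(history_x)
    # Regularised Gram matrix over the contexts played so far.
    K = [[kernel(history_x[i], history_x[j]) + (reg if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, history_y)
    scores = []
    for x in contexts:
        k = [kernel(x, h) for h in history_x]
        mean = sum(ki * ai for ki, ai in zip(k, alpha))
        beta = solve(K, k)
        var = kernel(x, x) - sum(ki * bi for ki, bi in zip(k, beta))
        width = math.sqrt(max(var, 0.0) / reg)  # clamp tiny negative round-off
        scores.append(mean + explore * width)
    return scores


def choose_action(contexts, history_x, history_y, **kwargs):
    """Index of the context with the highest upper-confidence score."""
    scores = kernel_ucb_scores(contexts, history_x, history_y, **kwargs)
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch, setting `explore=0.0` reduces the rule to greedy selection on the kernel ridge estimate. A larger `explore` favours contexts that are dissimilar to those already played, which is how similarity information can stand in for sampling every action.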
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Smart Grid Energy Management
