Adversarial Contextual Bandits Go Kernelized

Gergely Neu; Julia Olkhovskaya; Sattar Vakili

arXiv:2310.01609 · stat.ML · October 4, 2023

TL;DR

This paper extends adversarial linear contextual bandits to kernelized loss functions, proposing an efficient algorithm with near-optimal regret bounds that adapt to different eigenvalue decay rates of the kernel.

Contribution

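The paper introduces a computationally efficient algorithm for kernelized adversarial contextual bandits, built on a new optimistically biased loss estimator, and shows that it achieves near-optimal regret under several eigenvalue decay assumptions on the kernel.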

Findings

1. Regret bound of $O(K T^{\frac{1}{2}(1 + \frac{1}{c})})$ for polynomial eigendecay with exponent $c > 1$

2. Regret bound of $O(\sqrt{T})$ for exponential eigendecay

3. Matches known lower bounds and improves upon previous bounds in kernelized adversarial bandits

Abstract

We study a generalization of the problem of online learning in adversarial linear contextual bandits by incorporating loss functions that belong to a reproducing kernel Hilbert space, which allows for a more flexible modeling of complex decision-making scenarios. We propose a computationally efficient algorithm that makes use of a new optimistically biased estimator for the loss functions and achieves near-optimal regret guarantees under a variety of eigenvalue decay assumptions made on the underlying kernel. Specifically, under the assumption of polynomial eigendecay with exponent $c > 1$, the regret is $O(K T^{\frac{1}{2}(1 + \frac{1}{c})})$, where $T$ denotes the number of rounds and $K$ the number of actions. Furthermore, when the eigendecay follows an exponential pattern, we achieve an even tighter regret bound of $O(\sqrt{T})$. These rates match the lower bounds…
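To make the polynomial-eigendecay rate concrete, the following minimal sketch evaluates the exponent of $T$ in the stated bound for a few values of $c$. The function names are hypothetical and chosen only for illustration; the computation ignores constants and any logarithmic factors, so it shows the shape of the rate, not actual regret values.

```python
def polynomial_regret_exponent(c: float) -> float:
    """Exponent of T in the K * T^{(1/2)(1 + 1/c)} rate (hypothetical helper).

    The bound in the abstract is stated for eigendecay exponent c > 1.
    """
    if c <= 1:
        raise ValueError("the bound is stated for eigendecay exponent c > 1")
    return 0.5 * (1.0 + 1.0 / c)


def polynomial_regret_rate(K: int, T: int, c: float) -> float:
    """Rate K * T^{(1/2)(1 + 1/c)}, with constants and log factors dropped."""
    return K * T ** polynomial_regret_exponent(c)


if __name__ == "__main__":
    # Faster eigendecay (larger c) pushes the exponent toward 1/2,
    # the exponent of the rate quoted for exponential eigendecay.
    for c in (2.0, 4.0, 10.0, 100.0):
        print(f"c = {c:>5}: exponent of T = {polynomial_regret_exponent(c):.4f}")
```

As $c$ grows, the exponent $\frac{1}{2}(1 + \frac{1}{c})$ decreases toward $\frac{1}{2}$, which is consistent with the $\sqrt{T}$ rate quoted for exponential eigendecay.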

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Adversarial Robustness in Machine Learning