Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards

Marco Mussi; Simone Drago; Alberto Maria Metelli

arXiv:2407.06321 · stat.ML · July 10, 2024

TL;DR

This paper investigates the challenge of establishing tight bounds for kernelized bandit algorithms when rewards are Bernoulli-distributed, a less-explored setting compared to subgaussian noise models, highlighting an open problem in online learning.

Contribution

It identifies and emphasizes the open problem of deriving tight bounds for kernelized bandits with Bernoulli rewards, contrasting with existing subgaussian noise models.

Findings

1. Highlights the open problem in kernelized bandits with Bernoulli rewards.

2. Contrasts Bernoulli rewards with subgaussian noise models.

3. Draws attention to the need for theoretical advancements in this area.

Abstract

We consider Kernelized Bandits (KBs) to optimize a function $f : X \to [0, 1]$ belonging to the Reproducing Kernel Hilbert Space (RKHS) $H_{k}$. Mainstream works on kernelized bandits focus on a subgaussian noise model in which observations of the form $f(x_{t}) + \epsilon_{t}$ are available, where $\epsilon_{t}$ is subgaussian noise (Chowdhury and Gopalan, 2017). In contrast, we focus on the case in which we observe realizations $y_{t} \sim \mathrm{Ber}(f(x_{t}))$ sampled from a Bernoulli distribution with parameter $f(x_{t})$. While the Bernoulli model has been investigated successfully in multi-armed bandits (Garivier and Cappé, 2011), logistic bandits (Faury et al., 2022), and bandits in metric spaces (Magureanu et al., 2014), it remains an open question whether tight results can be obtained for KBs. This paper aims to draw the attention of the…
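The two observation models contrasted in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the function `f`, the Gaussian kernel, its lengthscale, the centers and weights, and the noise level `sigma` are all hypothetical choices, made only so that `f` is a kernel expansion taking values in [0, 1]. It shows that Bernoulli feedback is binary with variance $f(x)(1 - f(x)) \le 1/4$, whereas the subgaussian model adds noise whose scale does not depend on $f(x)$.

```python
import math
import random

# Hypothetical toy setup (not from the paper): f is a finite expansion
# in a Gaussian kernel, with weights chosen so that 0 < f(x) <= 0.9.
LENGTHSCALE = 0.2
CENTERS = (0.2, 0.8)
WEIGHTS = (0.5, 0.4)


def rbf_kernel(x, y, lengthscale=LENGTHSCALE):
    """Gaussian (RBF) kernel k(x, y) on the real line."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def f(x):
    """Illustrative reward function: a weighted sum of kernel sections."""
    return sum(w * rbf_kernel(x, c) for w, c in zip(WEIGHTS, CENTERS))


def subgaussian_feedback(x, rng, sigma=0.1):
    """Subgaussian model: y = f(x) + eps, here with Gaussian eps (assumed)."""
    return f(x) + rng.gauss(0.0, sigma)


def bernoulli_feedback(x, rng):
    """Bernoulli model: y ~ Ber(f(x)), so y is either 0 or 1."""
    return 1 if rng.random() < f(x) else 0


def bernoulli_variance(x):
    """Variance of Ber(f(x)); it depends on x and never exceeds 1/4."""
    p = f(x)
    return p * (1.0 - p)


if __name__ == "__main__":
    rng = random.Random(0)
    for x in (0.2, 0.5, 1.5):
        samples = [bernoulli_feedback(x, rng) for _ in range(5000)]
        mean = sum(samples) / len(samples)
        print(f"x={x}: f(x)={f(x):.3f}, empirical mean={mean:.3f}, "
              f"variance={bernoulli_variance(x):.3f}")
```

In this toy example the Bernoulli variance changes with the chosen point and shrinks toward zero wherever `f(x)` approaches 0 or 1, while an analysis based on a single subgaussian constant treats every point as equally noisy.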

Taxonomy

Topics: Advanced Bandit Algorithms Research