Correlated Multi-armed Bandits with a Latent Random Source

Samarth Gupta; Gauri Joshi; Osman Yağan

arXiv:1808.05904·stat.ML·January 31, 2019

TL;DR

This paper introduces a multi-armed bandit framework in which every arm's reward is a function of a shared latent random variable. The resulting correlation between arms is used to identify non-competitive arms and avoid exploring them, which lowers regret.

Contribution

The paper proposes a generalized UCB algorithm that exploits the correlation between arms induced by the common latent source. This reduces a $K$-armed problem to a $C + 1$-armed one (the best arm plus $C$ competitive arms) and yields $O(1)$ regret in certain regimes.

Findings

01

The algorithm pulls non-competitive arms only $O(1)$ times, while competitive arms are pulled $O(\log T)$ times.

02

In some regimes, the algorithm attains $O(1)$ regret, rather than the logarithmic regret typical of multi-armed bandit algorithms.

03

Theoretical analysis confirms the algorithm's optimality in certain settings.
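
One way to see how the first two findings connect is to combine the standard regret decomposition with the pull counts the paper reports. The notation here is assumed for illustration and does not appear on this page: $k^*$ is the best arm, $\Delta_k$ is the gap between the mean reward of $k^*$ and that of arm $k$, $n_k(T)$ is the number of pulls of arm $k$ in $T$ rounds, $K$ is the total number of arms, and $C$ is the number of competitive arms.

$$
\mathbb{E}[\mathrm{Reg}(T)]
= \sum_{k \neq k^*} \Delta_k \, \mathbb{E}[n_k(T)]
= \underbrace{\sum_{k \text{ competitive}} \Delta_k \, O(\log T)}_{C \text{ arms}}
+ \underbrace{\sum_{k \text{ non-competitive}} \Delta_k \, O(1)}_{K - 1 - C \text{ arms}}
$$

Under these assumptions, when $C = 0$ the first sum is empty and the expected regret is $O(1)$; otherwise it scales as $O(C \log T)$.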

Abstract

We consider a novel multi-armed bandit framework where the rewards obtained by pulling the arms are functions of a common latent random variable. The correlation between arms due to the common random source can be used to design a generalized upper-confidence-bound (UCB) algorithm that identifies certain arms as non-competitive, and avoids exploring them. As a result, we reduce a $K$-armed bandit problem to a $C + 1$-armed problem, where $C + 1$ includes the best arm and $C$ competitive arms. Our regret analysis shows that the competitive arms need to be pulled $O(\log T)$ times, while the non-competitive arms are pulled only $O(1)$ times. As a result, there are regimes where our algorithm achieves $O(1)$ regret as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms. We also evaluate lower bounds on the expected regret and…
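
The reward model described in the abstract can be illustrated with a small simulation. This is a minimal sketch of the setting only, not the paper's algorithm: the three reward functions, the Uniform(0, 1) distribution for the latent variable, and the helper names `pull`, `empirical_means`, and `best_arm` are all assumptions made for illustration.

```python
import random

# Hypothetical reward functions g_k(X), chosen only for illustration.
# The abstract states only that each arm's reward is a function of a
# common latent random variable X; it does not specify these forms.
REWARD_FUNCTIONS = (
    lambda x: x,            # arm 0, mean 1/2 under Uniform(0, 1)
    lambda x: x * x,        # arm 1, mean 1/3
    lambda x: 1.0 - x * x,  # arm 2, mean 2/3
)


def pull(arm, rng):
    """Pull one arm: draw a fresh latent X ~ Uniform(0, 1) (an assumption)
    and return g_arm(X). The learner observes the reward, not X itself."""
    x = rng.random()
    return REWARD_FUNCTIONS[arm](x)


def empirical_means(num_pulls, seed=0):
    """Estimate each arm's mean reward from num_pulls independent pulls."""
    rng = random.Random(seed)
    means = []
    for arm in range(len(REWARD_FUNCTIONS)):
        total = sum(pull(arm, rng) for _ in range(num_pulls))
        means.append(total / num_pulls)
    return means


def best_arm(means):
    """Return the index of the arm with the largest mean reward."""
    return max(range(len(means)), key=lambda k: means[k])
```

Calling `best_arm(empirical_means(20000))` should select arm 2 in this toy setup. Because all three rewards are driven by the same latent variable, a reward observed from one arm carries information about what the other arms would have paid, which is the correlation the paper's algorithm exploits.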
