Multi-Armed Bandits with Correlated Arms

Samarth Gupta; Shreyas Chaudhari; Gauri Joshi; Osman Yağan

arXiv:1911.03959·stat.ML·September 13, 2021

TL;DR

This paper introduces an approach to multi-armed bandit problems in which the rewards of different arms are correlated. It exploits those correlations to reduce regret relative to classic bandit algorithms, and validates the method on the MovieLens and Goodreads datasets.

Contribution

The paper generalizes classic bandit algorithms to the correlated-reward setting, provides a unified proof technique for analyzing them, and shows that they achieve order-optimal regret when arms are correlated through a latent random source.

Findings

01

C-UCB pulls non-competitive arms only O(1) times

02

The proposed algorithms outperform classic bandit algorithms on the MovieLens and Goodreads datasets

03

Achieves order-optimal regret when arms are correlated through a latent source

Abstract

We consider a multi-armed bandit framework where the rewards obtained by pulling different arms are correlated. We develop a unified approach to leverage these reward correlations and present fundamental generalizations of classic bandit algorithms to the correlated setting. We present a unified proof technique to analyze the proposed algorithms. Rigorous analysis of C-UCB (the correlated bandit version of Upper Confidence Bound) reveals that the algorithm pulls certain sub-optimal arms, termed non-competitive, only O(1) times, as opposed to the O(log T) pulls required by classic bandit algorithms such as UCB and TS. We present a regret lower bound and show that, when arms are correlated through a latent random source, our algorithms obtain order-optimal regret. We validate the proposed algorithms via experiments on the MovieLens and Goodreads datasets, and show significant…
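To make the O(1)-versus-O(log T) contrast concrete, here is a minimal Python sketch of a C-UCB-style loop. This page does not spell out the algorithm, so everything here is an illustrative assumption rather than the authors' implementation: the function names (`ucb1_index`, `run_c_ucb_sketch`), the `pseudo` table of reward upper bounds, the rule for choosing the "competitive" set, and the toy latent-source model in which both arms are deterministic functions of a shared uniform sample.

```python
import math
import random


def ucb1_index(mean, pulls, t):
    """Classic UCB1 index: empirical mean plus an exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def run_c_ucb_sketch(reward_fns, pseudo, horizon, seed=0):
    """Hypothetical C-UCB-style loop (an assumed reading, not the paper's code).

    reward_fns[k](x): reward of arm k for latent sample x (assumed model).
    pseudo[l][k](r):  assumed upper bound on arm l's expected reward
                      given that arm k returned reward r.
    Returns the number of pulls of each arm.
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    pulls = [0] * n_arms
    sums = [0.0] * n_arms
    # pseudo_sums[l][k]: running sum of arm l's pseudo-rewards over pulls of arm k
    pseudo_sums = [[0.0] * n_arms for _ in range(n_arms)]

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initialisation: pull every arm once
        else:
            means = [sums[k] / pulls[k] for k in range(n_arms)]
            # Reference arm: the one pulled most often so far.
            lead = max(range(n_arms), key=lambda k: pulls[k])
            # Keep an arm only if its empirical pseudo-reward with respect
            # to the reference arm is at least the reference arm's mean.
            competitive = [
                l for l in range(n_arms)
                if l == lead or pseudo_sums[l][lead] / pulls[lead] >= means[lead]
            ]
            # Plain UCB1 restricted to the competitive set.
            arm = max(competitive, key=lambda k: ucb1_index(means[k], pulls[k], t))

        x = rng.random()  # shared latent source, uniform on [0, 1)
        r = reward_fns[arm](x)
        pulls[arm] += 1
        sums[arm] += r
        for l in range(n_arms):
            pseudo_sums[l][arm] += pseudo[l][arm](r)
    return pulls


# Toy model (assumption): arm 0 pays x, arm 1 pays 0.4 * x.
reward_fns = [lambda x: x, lambda x: 0.4 * x]
pseudo = [
    [lambda r: r, lambda r: r / 0.4],   # bounds on arm 0 given arm 0 / arm 1
    [lambda r: 0.4 * r, lambda r: r],   # bounds on arm 1 given arm 0 / arm 1
]
pulls = run_c_ucb_sketch(reward_fns, pseudo, horizon=2000)
print(pulls)
```

In this toy model the pseudo-reward of arm 1 computed from arm 0's samples is always 0.4 times arm 0's empirical mean, so arm 1 never enters the competitive set after its single initialisation pull. Plain UCB1 on the same two arms would keep revisiting arm 1 at a rate that grows logarithmically with the horizon.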

Taxonomy

Methods: Spatio-temporal stability analysis