The Pareto Frontier of Instance-Dependent Guarantees in Multi-Player Multi-Armed Bandits with no Communication

Allen Liu; Mark Sellke

arXiv:2202.09653·cs.LG·June 8, 2022

TL;DR

This paper characterizes the trade-offs among instance-dependent regret guarantees in multi-player multi-armed bandits without communication. It shows that the optimal $\tilde{O}(1/Δ)$ regret cannot be attained for all gaps $Δ$ simultaneously, and it proposes a generalized algorithm that remains effective under adversarial feedback.

Contribution

It provides a complete characterization of the Pareto frontier of regret guarantees achievable without communication and introduces a new topological lower bound technique.

Findings

01

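Without communication, the optimal $\tilde{O}(1/Δ)$ regret cannot be achieved for every gap $Δ$: attaining it for some values of $Δ$ forces strictly sub-optimal regret in other regimes.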

02

The paper characterizes all Pareto optimal trade-offs in regret.

03

A generalized algorithm achieves these trade-offs even with adversarial feedback.

Abstract

We study the stochastic multi-player multi-armed bandit problem. In this problem, $m$ players cooperate to maximize their total reward from $K > m$ arms. However, the players cannot communicate and are penalized (e.g. receive no reward) if they pull the same arm at the same time. We ask whether it is possible to obtain optimal instance-dependent regret $\tilde{O}(1/Δ)$, where $Δ$ is the gap between the $m$-th and $(m+1)$-st best arms. Such guarantees were recently achieved in a model allowing the players to implicitly communicate through intentional collisions. Surprisingly, we show that with no communication at all, such guarantees are not achievable. In fact, obtaining the optimal $\tilde{O}(1/Δ)$ regret for some values of $Δ$ necessarily implies strictly sub-optimal regret in other regimes. Our main result is a complete characterization of the Pareto optimal…
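To make the setup in the abstract concrete, here is a minimal Python sketch of the problem's ingredients: the gap $Δ$ between the $m$-th and $(m+1)$-st best arms, and a single round in which colliding players receive no reward. It is an illustration under stated assumptions, not the paper's algorithm. The Bernoulli reward model, the per-round regret formula, and the function names `gap`, `play_round`, and `expected_round_regret` are all assumptions introduced here.

```python
import random
from collections import Counter


def gap(means, m):
    """Gap between the m-th and (m+1)-st largest arm means (the Delta of the abstract)."""
    ordered = sorted(means, reverse=True)
    return ordered[m - 1] - ordered[m]


def play_round(means, pulls, rng):
    """Simulate one round. pulls[i] is the arm chosen by player i.

    Assumption: rewards are Bernoulli(means[arm]); a player who shares
    an arm with another player collides and receives 0.
    """
    counts = Counter(pulls)
    rewards = []
    for arm in pulls:
        if counts[arm] > 1:
            rewards.append(0.0)  # collision: no reward
        else:
            rewards.append(1.0 if rng.random() < means[arm] else 0.0)
    return rewards


def expected_round_regret(means, pulls):
    """Expected regret of one round against the best collision-free assignment.

    Assumption: the benchmark is the sum of the m largest means, where
    m = len(pulls), and collided pulls contribute zero expected reward.
    """
    m = len(pulls)
    best = sum(sorted(means, reverse=True)[:m])
    counts = Counter(pulls)
    obtained = sum(means[arm] for arm in pulls if counts[arm] == 1)
    return best - obtained


if __name__ == "__main__":
    means = [0.9, 0.8, 0.5, 0.2]  # hypothetical instance with K = 4 arms
    m = 2                         # two players
    print("gap:", gap(means, m))
    print("regret, no collision:", expected_round_regret(means, [0, 1]))
    print("regret, collision:", expected_round_regret(means, [0, 0]))
    print("sampled rewards:", play_round(means, [0, 1], random.Random(0)))
```

In this toy instance the two players do best by occupying the two highest-mean arms without overlapping. The smaller the gap, the harder it is to tell the $m$-th arm from the $(m+1)$-st, which is why the regret guarantee is stated in terms of $1/Δ$.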

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Game Theory and Applications