Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization
Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang

TL;DR
This paper introduces BEACON, a novel algorithm for heterogeneous multi-player multi-armed bandits that closes the regret gap to the centralized lower bound through implicit communication and efficient exploration, and extends to nonlinear reward functions.
Contribution
BEACON combines implicit communication and batched exploration to achieve near-optimal regret in heterogeneous MP-MAB and generalizes to nonlinear reward functions.
Findings
BEACON achieves logarithmic regret in heterogeneous MP-MAB.
The adaptive differential communication (ADC) design improves implicit communication efficiency.
The approach bridges combinatorial and multi-player MAB research areas.
Abstract
Despite significant interest and much progress in decentralized multi-player multi-armed bandit (MP-MAB) problems in recent years, closing the regret gap to the natural centralized lower bound in the heterogeneous MP-MAB setting remains an open problem. In this paper, we propose BEACON -- Batched Exploration with Adaptive COmmunicatioN -- that closes this gap. BEACON accomplishes this goal with novel contributions in implicit communication and efficient exploration. For the former, we propose a novel adaptive differential communication (ADC) design that significantly improves the implicit communication efficiency. For the latter, a carefully crafted batched exploration scheme is developed to enable incorporation of the combinatorial upper confidence bound (CUCB) principle. We then generalize the existing linear-reward MP-MAB problems, where the system reward is always the sum of individually…
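To make the combinatorial UCB idea concrete, the following is a minimal Python sketch of how an optimistic, collision-free player-to-arm matching could be chosen in the linear-reward heterogeneous setting. It is not BEACON itself: it omits batching and all communication, the function names `ucb_index` and `best_matching` are hypothetical, the confidence radius is a generic UCB1-style choice assumed here for illustration, and the brute-force search stands in for whatever matching oracle one would actually use.

```python
import math
from itertools import permutations


def ucb_index(mean, pulls, t):
    """Optimistic estimate for one (player, arm) pair.

    Assumption: a generic UCB1-style radius, not the paper's exact bonus.
    """
    if pulls == 0:
        return float("inf")  # force unexplored pairs to be tried
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def best_matching(index):
    """Return the collision-free assignment maximizing the summed index.

    index[m][k] is the index of arm k for player m. Brute force over all
    assignments of distinct arms to players; fine for tiny instances only.
    """
    n_players = len(index)
    n_arms = len(index[0])
    best_value, best_arms = -math.inf, None
    for arms in permutations(range(n_arms), n_players):
        value = sum(index[m][k] for m, k in enumerate(arms))
        if value > best_value:
            best_value, best_arms = value, arms
    return best_arms, best_value


# Hypothetical 2-player, 3-arm instance with heterogeneous means.
means = [[0.90, 0.80, 0.10],
         [0.95, 0.20, 0.30]]
pulls = [[10, 10, 10],
         [10, 10, 10]]
t = 100

indices = [[ucb_index(means[m][k], pulls[m][k], t) for k in range(3)]
           for m in range(2)]
assignment, _ = best_matching(indices)
print(assignment)  # arm chosen by player 0, then player 1
```

In this toy instance both players prefer arm 0, but the summed reward is highest when player 0 takes arm 1 and leaves arm 0 to player 1. This is why the heterogeneous setting calls for a matching over player-arm pairs rather than a per-player ranking of arms.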
Taxonomy
Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Smart Grid Energy Management
