Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits
Chao Tao, Qin Zhang, Yuan Zhou

TL;DR
This paper investigates the limits of collaborative learning in multi-armed bandits with multiple agents, focusing on how limited communication affects the speedup over centralized approaches, and establishes tight bounds for this distributed exploration problem.
Contribution
The paper provides nearly optimal bounds on the number of communication rounds needed for distributed best arm identification, and it introduces new techniques for proving lower bounds.
Findings
Almost tight round-speedup tradeoffs are established.
New techniques for lower bounds on communication steps are developed.
The impact of limited interaction on distributed exploration efficiency is quantified.
Abstract
Best arm identification (or, pure exploration) in multi-armed bandits is a fundamental problem in machine learning. In this paper we study the distributed version of this problem where we have multiple agents, and they want to learn the best arm collaboratively. We want to quantify the power of collaboration under limited interaction (or, communication steps), as interaction is expensive in many settings. We measure the running time of a distributed algorithm as the speedup over the best centralized algorithm where there is only one agent. We give almost tight round-speedup tradeoffs for this problem, along which we develop several new techniques for proving lower bounds on the number of communication steps under time or confidence constraints.
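To make the notion of speedup under limited interaction concrete, here is a minimal sketch of a naive one-round baseline. It is not the algorithm analyzed in the paper. The function names, the Bernoulli reward model, and the equal-budget notion of speedup (total pulls of a single agent divided by time steps per collaborating agent) are assumptions made for illustration only; the paper measures speedup against the best centralized algorithm.

```python
import random


def pull(mean, rng):
    """One Bernoulli pull of an arm with the given mean reward (assumed model)."""
    return 1.0 if rng.random() < mean else 0.0


def one_round_collaboration(means, num_agents, pulls_per_arm, seed=0):
    """Naive one-round baseline (hypothetical, not the paper's algorithm).

    Each agent pulls every arm `pulls_per_arm` times on its own, then all
    agents communicate exactly once by pooling their per-arm reward sums.
    Returns (arm with the highest pooled empirical mean, time steps per agent).
    """
    rng = random.Random(seed)
    totals = [0.0] * len(means)
    for _ in range(num_agents):
        for arm, mean in enumerate(means):
            totals[arm] += sum(pull(mean, rng) for _ in range(pulls_per_arm))
    pooled = [t / (num_agents * pulls_per_arm) for t in totals]
    best = max(range(len(means)), key=lambda a: pooled[a])
    time_per_agent = len(means) * pulls_per_arm
    return best, time_per_agent


def speedup(centralized_time, distributed_time):
    """Ratio of single-agent running time to per-agent distributed running time."""
    return centralized_time / distributed_time


if __name__ == "__main__":
    means = [0.9, 0.2, 0.1]  # arm 0 is the best arm in this toy instance
    best, t_dist = one_round_collaboration(means, num_agents=4, pulls_per_arm=50)
    # A single agent collecting the same number of samples needs 4x the time steps.
    t_central = 4 * t_dist
    print(best, t_dist, speedup(t_central, t_dist))
```

In this toy setting four agents with a single communication round match the sample count of one agent in a quarter of the time steps. The paper's question is how close to such a speedup one can get against the best (adaptive) centralized algorithm when the number of rounds is restricted.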
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
