Near-Optimal Regret for Distributed Adversarial Bandits: A Black-Box Approach
Hao Qiu, Mengxiao Zhang, Nicolò Cesa-Bianchi

TL;DR
This paper introduces a near-optimal algorithm for distributed adversarial bandits, built on a black-box reduction to bandits with delayed feedback. It improves on the previous best regret bound and extends to linear bandits.
Contribution
The paper presents a novel black-box reduction approach for distributed adversarial bandits, achieving near-optimal regret bounds and extending to linear bandits with minimal communication.
Findings
Achieves minimax regret of $\tilde{\Theta}\big(\sqrt{(\rho^{-1/2} + K/N)\,T}\big)$
Introduces a gossip-based algorithm with improved regret bounds over previous work
Extends framework to distributed linear bandits with low communication cost
Abstract
We study distributed adversarial bandits, where $N$ agents cooperate to minimize the global average loss while observing only their own local losses. We show that the minimax regret for this problem is $\tilde{\Theta}\big(\sqrt{(\rho^{-1/2} + K/N)\,T}\big)$, where $T$ is the horizon, $K$ is the number of actions, and $\rho$ is the spectral gap of the communication matrix. Our algorithm, based on a novel black-box reduction to bandits with delayed feedback, requires agents to communicate only through gossip. It achieves an upper bound that significantly improves over the previous best bound of Yi and Vojnovic (2023). We complement this result with a matching lower bound, showing that the problem's difficulty decomposes into a communication cost $\rho^{-1/4}\sqrt{T}$ and a bandit cost $\sqrt{KT/N}$. We further demonstrate the versatility of our approach by deriving…
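To make the role of gossip and the spectral gap more concrete, the sketch below runs plain gossip averaging on a ring of agents. It is not the paper's algorithm: the lazy ring topology, the function names, and the definition of the gap as one minus the second-largest eigenvalue of the gossip matrix are all assumptions chosen for illustration. It only shows that agents exchanging values with neighbours converge to the global average, and that a smaller gap means more rounds are needed, which is the kind of delay the reduction has to absorb.

```python
import math


def ring_gossip_matrix(n):
    """Doubly stochastic matrix for a lazy ring (assumed topology, n >= 3).

    Each agent keeps weight 1/2 on its own value and puts 1/4 on each neighbour.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][i] = 0.5
        W[i][(i - 1) % n] += 0.25
        W[i][(i + 1) % n] += 0.25
    return W


def gossip_step(W, x):
    """One gossip round: every agent replaces its value by a weighted neighbour average."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


def ring_spectral_gap(n):
    """Gap 1 - lambda_2 for the lazy ring above.

    The matrix is circulant, so its eigenvalues are 1/2 + 1/2 * cos(2*pi*k/n);
    the second largest is attained at k = 1.
    """
    return 1.0 - (0.5 + 0.5 * math.cos(2.0 * math.pi / n))


def rounds_to_consensus(W, x, tol=1e-3, max_rounds=100_000):
    """Gossip until every agent is within tol of the global average."""
    target = sum(x) / len(x)
    rounds = 0
    while max(abs(v - target) for v in x) > tol and rounds < max_rounds:
        x = gossip_step(W, x)
        rounds += 1
    return rounds, x


if __name__ == "__main__":
    for n in (8, 16, 32):
        local_losses = [1.0] + [0.0] * (n - 1)  # only agent 0 observes a loss
        rounds, _ = rounds_to_consensus(ring_gossip_matrix(n), local_losses)
        print(n, round(ring_spectral_gap(n), 4), rounds)
```

Running the script prints, for each ring size, the gap and the number of gossip rounds needed to reach consensus; larger rings have a smaller gap and take more rounds.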
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Mobile Crowdsensing and Crowdsourcing
