Near-Optimal Regret for Distributed Adversarial Bandits: A Black-Box Approach

Hao Qiu, Mengxiao Zhang, Nicolò Cesa-Bianchi

arXiv:2602.06404 · cs.LG · February 9, 2026

TL;DR

This paper gives a near-optimal algorithm for distributed adversarial bandits, built on a black-box reduction to bandits with delayed feedback. The approach improves significantly on the previous best regret bound and extends to linear bandits.

Contribution

The paper presents a novel black-box reduction from distributed adversarial bandits to bandits with delayed feedback. The resulting gossip-based algorithm achieves near-optimal regret, and the framework extends to distributed linear bandits with low communication cost.

Findings

01

Achieves minimax regret of $\tilde{\Theta}(\sqrt{(\rho^{-1/2} + K/N)\,T})$

02

Introduces a gossip-based algorithm whose regret improves on the previous best bound $\tilde{O}(\rho^{-1/3} (KT)^{2/3})$ of Yi and Vojnovic (2023)

03

Extends framework to distributed linear bandits with low communication cost
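The gossip-based communication named in the findings can be illustrated with a small, self-contained sketch. It is not the paper's algorithm: it only shows how repeated multiplication by a communication matrix drives each agent's local value toward the global average, and how the spectral gap controls the speed of that convergence. The ring topology, the 1/3 weights, the helper names (`ring_gossip_matrix`, `gossip_average`, `spectral_gap`), and the definition of the gap as one minus the second-largest eigenvalue modulus of a symmetric doubly stochastic matrix are all assumptions made for illustration.

```python
import math
import random


def ring_gossip_matrix(n):
    """Symmetric doubly stochastic matrix for a ring of n agents.

    Illustrative choice: each agent puts weight 1/3 on itself and on each
    of its two ring neighbours.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def gossip_step(W, x):
    """One gossip round: every agent replaces its value by a weighted
    average of its neighbours' values (a matrix-vector product)."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def gossip_average(W, x, rounds):
    """Run the given number of gossip rounds starting from local values x."""
    for _ in range(rounds):
        x = gossip_step(W, x)
    return x


def spectral_gap(W, iters=500, seed=0):
    """Estimate 1 - |lambda_2| by power iteration.

    Assumes W is symmetric and doubly stochastic, so the all-ones vector is
    the top eigenvector and can be projected out at every iteration.
    """
    n = len(W)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    lam = 0.0
    for _ in range(iters):
        mean = sum(v) / n
        v = [vi - mean for vi in v]  # remove the consensus component
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm == 0.0:  # nothing left outside consensus
            return 1.0
        v = [vi / norm for vi in v]
        v = gossip_step(W, v)
        lam = math.sqrt(sum(vi * vi for vi in v))  # ||W v|| with ||v|| = 1
    return 1.0 - lam


if __name__ == "__main__":
    n = 8
    W = ring_gossip_matrix(n)
    rho = spectral_gap(W)
    local_losses = [float(i) for i in range(n)]  # made-up local values
    estimates = gossip_average(W, local_losses, rounds=50)
    print("spectral gap:", rho)
    print("global average:", sum(local_losses) / n)
    print("agent estimates after gossip:", estimates)
```

In this sketch the deviation from the global average shrinks by a factor of at most $1 - \rho$ per round, so a smaller spectral gap means more rounds are needed before agents agree. This is one informal way to read the dependence on $\rho$ in the regret bound.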

Abstract

We study distributed adversarial bandits, where $N$ agents cooperate to minimize the global average loss while observing only their own local losses. We show that the minimax regret for this problem is $\tilde{\Theta}(\sqrt{(\rho^{-1/2} + K/N)\,T})$, where $T$ is the horizon, $K$ is the number of actions, and $\rho$ is the spectral gap of the communication matrix. Our algorithm, based on a novel black-box reduction to bandits with delayed feedback, requires agents to communicate only through gossip. It achieves an upper bound that significantly improves over the previous best bound $\tilde{O}(\rho^{-1/3} (KT)^{2/3})$ of Yi and Vojnovic (2023). We complement this result with a matching lower bound, showing that the problem's difficulty decomposes into a communication cost $\rho^{-1/4} \sqrt{T}$ and a bandit cost $\sqrt{KT/N}$. We further demonstrate the versatility of our approach by deriving…
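The decomposition into a communication cost and a bandit cost can be checked with an elementary inequality. The sketch below is not taken from the paper's proof. It writes the rate in square-root form, ignores the logarithmic factors hidden by the tilde notation, and uses $a$ and $b$ as shorthand introduced here for illustration.

```latex
% For any a, b >= 0:  sqrt(a + b) <= sqrt(a) + sqrt(b) <= sqrt(2) * sqrt(a + b).
% Take a = \rho^{-1/2} T (communication) and b = K T / N (bandit).
\[
  \sqrt{\left(\rho^{-1/2} + \frac{K}{N}\right) T}
  \;\le\;
  \underbrace{\rho^{-1/4}\sqrt{T}}_{\text{communication cost}}
  \;+\;
  \underbrace{\sqrt{\frac{K T}{N}}}_{\text{bandit cost}}
  \;\le\;
  \sqrt{2}\,\sqrt{\left(\rho^{-1/2} + \frac{K}{N}\right) T}.
\]
```

Under this reading, the single rate and the sum of the two costs agree up to a factor of $\sqrt{2}$, so the rate is governed by whichever cost is larger. The communication term dominates when $\rho^{-1/2} \ge K/N$.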

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Mobile Crowdsensing and Crowdsourcing