A Near-optimal, Scalable and Parallelizable Framework for Stochastic Bandits Robust to Adversarial Corruptions and Beyond

Zicheng Hu; Cheng Chen

arXiv:2502.07514 · cs.LG · January 5, 2026

TL;DR

This paper introduces BARBAT, a scalable and parallelizable framework for stochastic bandits that is robust against adversarial corruptions, achieving near-optimal regret bounds and extending to multiple complex settings.

Contribution

We propose BARBAT, a novel framework that improves regret bounds of previous algorithms and extends to various complex bandit settings with better scalability and parallelization.

Findings

01 Achieves a regret bound that is optimal up to a logarithmic factor.

02 Extends to multi-agent bandits, graph bandits, combinatorial semi-bandits, and batched bandits.

03 Demonstrates efficiency through numerical experiments.

Abstract

We investigate various stochastic bandit problems in the presence of adversarial corruptions. A seminal work on this problem is the BARBAR algorithm (Gupta et al., 2019), which achieves both robustness and efficiency. However, it suffers a regret of $O(KC)$, which does not match the lower bound of $\Omega(C)$, where $K$ denotes the number of arms and $C$ denotes the corruption level. In this paper, we first improve the BARBAR algorithm by proposing a novel framework called BARBAT, which eliminates the factor of $K$ to achieve an optimal regret bound up to a logarithmic factor. We also extend BARBAT to various settings, including multi-agent bandits, graph bandits, combinatorial semi-bandits, and batched bandits. Compared with the Follow-the-Regularized-Leader framework, our methods are more amenable to parallelization, making them suitable for multi-agent and batched bandit…
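To make the roles of $K$ and $C$ concrete, the following is a minimal sketch of the corrupted-bandit setting only; it is not BARBAT or BARBAR. It assumes Bernoulli arms, a hypothetical adversary that flips every arm's reward in the first few rounds, a uniformly random placeholder learner, and the common convention that the corruption level $C$ is the sum over rounds of the largest per-arm change to the rewards. The abstract does not spell out that definition, so treat it as an assumption, along with every name in the code.

```python
import random


def simulate_corrupted_bandit(means, horizon, corrupt_rounds, seed=0):
    """Simulate a K-armed Bernoulli bandit with corrupted observations.

    All names here are hypothetical. The learner is a uniformly random
    placeholder policy, used only so that regret can be measured.
    """
    rng = random.Random(seed)
    best_mean = max(means)
    pseudo_regret = 0.0
    corruption_level = 0.0  # assumed: C = sum_t max_k |observed - true|
    observed_total = 0.0

    for t in range(horizon):
        # Stochastic rewards drawn for every arm this round.
        true_rewards = [1.0 if rng.random() < m else 0.0 for m in means]

        # Hypothetical adversary: flip every arm's reward in early rounds.
        if t < corrupt_rounds:
            observed = [1.0 - r for r in true_rewards]
        else:
            observed = list(true_rewards)

        # Per-round corruption is the largest change made to any arm.
        corruption_level += max(
            abs(o - r) for o, r in zip(observed, true_rewards)
        )

        # Placeholder learner: pull an arm uniformly at random.
        arm = rng.randrange(len(means))
        observed_total += observed[arm]

        # Regret is measured against the true means, not the corrupted rewards.
        pseudo_regret += best_mean - means[arm]

    return {
        "pseudo_regret": pseudo_regret,
        "corruption_level": corruption_level,
        "observed_total": observed_total,
    }


if __name__ == "__main__":
    result = simulate_corrupted_bandit([0.9, 0.5, 0.1], horizon=1000, corrupt_rounds=50)
    print(result)
```

In this sketch each corrupted round contributes exactly 1 to $C$, so `corrupt_rounds` rounds of flipping give $C$ equal to `corrupt_rounds`. The bounds in the abstract then read as follows: with $K = 3$ arms, a guarantee of order $KC$ allows the corruption-dependent part of the regret to grow three times faster in $C$ than a guarantee of order $C$.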

Taxonomy

Topics: Auction Theory and Applications · Advanced Bandit Algorithms Research · Blockchain Technology Applications and Security