Byzantine-Resilient Decentralized Multi-Armed Bandits

Jingxuan Zhu; Alec Koppel; Alvaro Velasquez; Ji Liu

arXiv:2310.07320 · cs.LG · June 12, 2025 · 1 citation

TL;DR

This paper introduces a decentralized resilient UCB algorithm for multi-armed bandits that maintains performance despite Byzantine agents, improving collective regret in adversarial environments through information fusion and truncation.

Contribution

The paper develops a fully decentralized resilient UCB algorithm that tolerates Byzantine agents, ensuring that each normal agent's regret is no worse than that of single-agent UCB1 while improving the normal agents' collective regret.

Findings

01. Each normal agent's regret is no worse than that of single-agent UCB1.

02. Collective regret is strictly better when each agent has sufficiently many neighbors.

03. The algorithm performs well in experiments under adversarial conditions.
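
For reference, the single-agent baseline named in the findings can be sketched as follows. This is the textbook UCB1 index rule (empirical mean plus sqrt(2 ln t / n)), not the paper's decentralized algorithm. The function name `ucb1` and the `pull` callback are hypothetical names chosen for illustration, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run textbook single-agent UCB1.

    `pull(arm)` is a hypothetical callback returning a reward in [0, 1].
    Returns per-arm pull counts and empirical means.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: play every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the empirical mean.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.2, 0.5, 0.8]  # illustrative Bernoulli arms
    counts, means = ucb1(
        lambda a: 1.0 if rng.random() < true_means[a] else 0.0,
        n_arms=len(true_means),
        horizon=2000,
    )
    print(counts, means)
```

The pull counts of suboptimal arms grow only logarithmically in the horizon, which is the per-agent regret level that the first finding takes as its benchmark.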

Abstract

In decentralized cooperative multi-armed bandits (MAB), each agent observes a distinct stream of rewards, and seeks to exchange information with others to select a sequence of arms so as to minimize its regret. Agents in the cooperative setting can outperform a single agent running a MAB method such as Upper-Confidence Bound (UCB) independently. In this work, we study how to recover such salient behavior when an unknown fraction of the agents can be Byzantine, that is, communicate arbitrarily wrong information in the form of reward mean-estimates or confidence sets. This framework can be used to model attackers in computer networks, instigators of offensive content into recommender systems, or manipulators of financial markets. Our key contribution is the development of a fully decentralized resilient upper confidence bound (UCB) algorithm that fuses an information mixing step among…
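
The abstract is cut off before it describes the mixing and truncation steps, so the following is only a generic illustration of how truncation can blunt arbitrarily wrong reports. It is a plain trimmed mean, a common Byzantine-resilient aggregation technique, and should not be read as the paper's actual update rule. The function name `trimmed_fusion` and the parameter `f` (an assumed upper bound on the number of Byzantine neighbors) are hypothetical.

```python
def trimmed_fusion(own_estimate, neighbor_estimates, f):
    """Fuse reward-mean estimates with a trimmed mean (illustrative only).

    Drops the f largest and f smallest neighbor reports, then averages the
    surviving reports together with the agent's own estimate. `f` is an
    assumed bound on how many neighbors may be Byzantine.
    """
    if len(neighbor_estimates) <= 2 * f:
        # Too few reports to trim safely: fall back to the local estimate.
        return own_estimate
    kept = sorted(neighbor_estimates)[f:len(neighbor_estimates) - f]
    return (own_estimate + sum(kept)) / (1 + len(kept))


if __name__ == "__main__":
    # Two honest neighbors report values near 0.5; two Byzantine neighbors
    # report extreme values that the trimming step discards.
    print(trimmed_fusion(0.5, [0.4, 0.6, 100.0, -100.0], f=1))
```

With at most `f` corrupted reports, every surviving value lies within the range spanned by honest reports, which is why extreme values cannot drag the fused estimate arbitrarily far.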

Taxonomy

Topics: Advanced Bandit Algorithms Research · Blockchain Technology Applications and Security · Age of Information Optimization