Cooperative Stochastic Multi-agent Multi-armed Bandits Robust to Adversarial Corruptions

Junyan Liu; Shuai Li; Dapeng Li

arXiv:2106.04207·cs.LG·June 9, 2021·1 cites

TL;DR

This paper introduces a new cooperative multi-agent algorithm for stochastic multi-armed bandits that is robust to adversarial corruptions, achieving near-optimal regret and efficient communication, and also addresses the single-agent case.

Contribution

The paper presents a corruption-agnostic algorithm for cooperative multi-agent bandits that attains near-optimal regret and extends to the single-agent scenario, resolving an open question.

Findings

01 Achieves near-optimal regret in stochastic setting

02 Maintains efficient communication among agents

03 Resolves an open problem in single-agent corruption setting

Abstract

We study the problem of stochastic bandits with adversarial corruptions in the cooperative multi-agent setting, where $V$ agents interact with a common $K$-armed bandit problem, and each pair of agents can communicate with each other to expedite the learning process. In the problem, the rewards are independently sampled from distributions across all agents and rounds, but they may be corrupted by an adversary. Our goal is to minimize both the overall regret and communication cost across all agents. We first show that an additive term of corruption is unavoidable for any algorithm in this problem. Then, we propose a new algorithm that is agnostic to the level of corruption. Our algorithm not only achieves near-optimal regret in the stochastic setting, but also obtains a regret with an additive term of corruption in the corrupted setting, while maintaining efficient communication. The…
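To make the setting concrete, here is a minimal toy simulation of the environment described in the abstract: several agents pulling arms of one shared bandit while an adversary alters some observed rewards. It is only a sketch under stated assumptions and is not the paper's algorithm. The Bernoulli rewards, the uniformly random arm choice, the unit-cost corruption rule, and the names `simulate`, `means`, and `corruption_budget` are all hypothetical choices made for illustration.

```python
import random


def simulate(num_agents, means, horizon, corruption_budget, seed=0):
    """Toy corrupted bandit environment (illustrative assumptions only).

    num_agents        -- number of agents sharing the bandit (V in the abstract)
    means             -- true Bernoulli mean of each arm (K = len(means))
    horizon           -- number of rounds; every agent pulls once per round
    corruption_budget -- total amount by which the adversary may change rewards
    """
    rng = random.Random(seed)
    num_arms = len(means)
    best = max(range(num_arms), key=lambda k: means[k])

    budget_left = float(corruption_budget)
    pseudo_regret = 0.0   # measured against the true (uncorrupted) means
    observed_total = 0.0  # sum of rewards the agents actually see
    corrupted = 0         # number of rewards the adversary changed

    for _ in range(horizon):
        for _ in range(num_agents):
            # Placeholder policy: uniform random arm, no learning, no messages.
            arm = rng.randrange(num_arms)
            reward = 1.0 if rng.random() < means[arm] else 0.0

            # Assumed adversary: hide successes of the best arm, paying
            # |change| = 1 from its budget for each altered reward.
            if arm == best and reward == 1.0 and budget_left >= 1.0:
                reward = 0.0
                budget_left -= 1.0
                corrupted += 1

            observed_total += reward
            pseudo_regret += means[best] - means[arm]

    return pseudo_regret, observed_total, corrupted


if __name__ == "__main__":
    regret, observed, changed = simulate(
        num_agents=3, means=[0.9, 0.5, 0.2], horizon=200, corruption_budget=10
    )
    print(f"pseudo-regret={regret:.1f} observed={observed:.0f} corrupted={changed}")
```

In this sketch the regret is computed from the true means, so the adversary affects only what the agents observe; a learning policy and a communication protocol would replace the random arm choice.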

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Reinforcement Learning in Robotics