Communication-Corruption Coupling and Verification in Cooperative Multi-Objective Bandits
Ming Shi

TL;DR
This paper investigates how the communication protocol and the corruption level jointly affect learning in cooperative multi-objective bandits under adversarial corruption, proposing regret bounds and protocols for effective collaboration under a corruption budget.
Contribution
It introduces a framework linking environment corruption to effective corruption in various sharing protocols, providing regret bounds and information-theoretic limits.
Findings
Raw-sample sharing amplifies corruption by a factor of N.
Summary and recommendation sharing maintain unamplified corruption levels.
Verification of observations is crucial in high-corruption regimes.
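The first two findings can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which the effective corruption is the environment-side budget multiplied by a protocol-dependent factor: N for raw-sample sharing, 1 for summary or recommendation sharing. The names `Protocol`, `multiplicity`, and `effective_corruption` are hypothetical and are not taken from the paper, whose multiplicity functional may be defined differently.

```python
from enum import Enum


class Protocol(Enum):
    """Sharing protocols named in the findings (identifiers are hypothetical)."""
    RAW_SAMPLES = "raw_samples"
    SUMMARIES = "summaries"
    RECOMMENDATIONS = "recommendations"


def multiplicity(protocol: Protocol, n_agents: int) -> int:
    """Assumed amplification factor: N for raw-sample sharing, otherwise 1."""
    if n_agents < 1:
        raise ValueError("n_agents must be at least 1")
    if protocol is Protocol.RAW_SAMPLES:
        # Simplifying assumption: every shared raw sample, corrupted or not,
        # is consumed by all N agents, so each unit of corruption counts N times.
        return n_agents
    # Summaries and recommendations are treated as unamplified.
    return 1


def effective_corruption(budget: float, protocol: Protocol, n_agents: int) -> float:
    """Effective corruption level under the simplified multiplicative model."""
    return multiplicity(protocol, n_agents) * budget


if __name__ == "__main__":
    for p in Protocol:
        print(p.value, effective_corruption(budget=10.0, protocol=p, n_agents=5))
```

Under this toy model, the same environment-side budget is five times more damaging with five agents sharing raw samples than with the same agents sharing only summaries or recommendations.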
Abstract
We study cooperative stochastic multi-armed bandits with vector-valued rewards under adversarial corruption and limited verification. In each round, each of N agents selects an arm, the environment generates a clean reward vector, and an adversary perturbs the observed feedback subject to a global corruption budget. Performance is measured by team regret under a coordinate-wise nondecreasing, Lipschitz scalarization, covering linear, Chebyshev, and smooth monotone utilities. Our main contribution is a communication-corruption coupling: we show that a fixed environment-side budget can translate into an effective corruption level ranging from the budget itself to N times the budget, depending on whether agents share raw samples, sufficient statistics, or only arm recommendations. We formalize this via a protocol-induced multiplicity functional and prove regret bounds…
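As an illustration of the scalarizations the abstract mentions, the sketch below gives one common form of a linear utility and of a Chebyshev utility. Both are coordinate-wise nondecreasing when the weights are nonnegative. The function names, the weights, and the reference point are assumptions made for illustration; the paper's exact definitions may differ.

```python
from typing import Sequence


def linear_scalarization(reward: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of objectives; nondecreasing in each coordinate
    whenever all weights are nonnegative."""
    return sum(w * r for w, r in zip(weights, reward))


def chebyshev_scalarization(
    reward: Sequence[float],
    weights: Sequence[float],
    reference: Sequence[float],
) -> float:
    """Negative weighted worst-case gap to a reference point (assumed form).

    Raising any reward coordinate can only shrink the largest gap, so the
    utility is nondecreasing in each coordinate for nonnegative weights.
    """
    return -max(w * (z - r) for w, r, z in zip(weights, reward, reference))


if __name__ == "__main__":
    r = [0.5, 0.2]
    print(linear_scalarization(r, weights=[1.0, 2.0]))
    print(chebyshev_scalarization(r, weights=[1.0, 1.0], reference=[1.0, 1.0]))
```

With respect to the max norm on reward vectors, the linear form is Lipschitz with constant equal to the sum of the weights and the Chebyshev form with constant equal to the largest weight, which is why both fit a monotone, Lipschitz utility class.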
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Stochastic Gradient Optimization Techniques
