Robust Multi-Agent Bandits Over Undirected Graphs
Daniel Vial, Sanjay Shakkottai, R. Srikant

TL;DR
This paper studies multi-agent bandit algorithms over networks with malicious agents, showing that existing methods fail on line graphs and proposing a new algorithm with regret bounds that depend on local malicious neighbors.
Contribution
The paper introduces a new algorithm for multi-agent bandits over arbitrary connected graphs, with regret bounds depending on local malicious neighbors, extending prior results beyond complete graphs.
Findings
Existing algorithms suffer nearly linear regret on line graphs.
The proposed algorithm achieves per-agent regret that depends on that agent's own malicious neighbors.
Regret bounds are generalized to any connected undirected graph.
Abstract
We consider a multi-agent multi-armed bandit setting in which $n$ honest agents collaborate over a network to minimize regret but $m$ malicious agents can disrupt learning arbitrarily. Assuming the network is the complete graph, existing algorithms incur $O((m + K/n) \log(T) / \Delta)$ regret in this setting, where $K$ is the number of arms and $\Delta$ is the arm gap. For $m \ll K$, this improves over the single-agent baseline regret of $O(K \log(T) / \Delta)$. In this work, we show the situation is murkier beyond the case of a complete graph. In particular, we prove that if the state-of-the-art algorithm is used on the undirected line graph, honest agents can suffer (nearly) linear regret until time is doubly exponential in $K$ and $n$. In light of this negative result, we propose a new algorithm for which the $i$-th agent has regret …
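To make the complete-graph comparison in the abstract concrete, the following is a minimal arithmetic sketch, not code from the paper. It assumes the two bounds take the order-level forms (m + K/n) log(T) / Δ for collaborating agents and K log(T) / Δ for a single agent, with all constants dropped. The function names and the parameter values are hypothetical, chosen only for illustration.

```python
import math


def complete_graph_bound(K, n, m, T, gap):
    """Order-level collaborative bound (m + K/n) * log(T) / gap.

    Assumed form with constants dropped; K arms, n honest agents,
    m malicious agents, horizon T, arm gap `gap`.
    """
    return (m + K / n) * math.log(T) / gap


def single_agent_bound(K, T, gap):
    """Order-level single-agent baseline K * log(T) / gap (constants dropped)."""
    return K * math.log(T) / gap


# Hypothetical illustrative values, not taken from the paper.
K, n, m, T, gap = 100, 20, 2, 10**6, 0.1

collab = complete_graph_bound(K, n, m, T, gap)
solo = single_agent_bound(K, T, gap)

# The log(T)/gap factor cancels, leaving (m + K/n) / K.
ratio = collab / solo
print(f"collaborative / single-agent = {ratio:.2f}")
```

With these values the ratio reduces to (m + K/n) / K, so the benefit of collaboration shrinks as m approaches K, which matches the abstract's condition that m be much smaller than K.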
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Misinformation and Its Impacts
