Fully Byzantine-Resilient Distributed Multi-Agent Q-Learning
Haejoon Lee, Dimitra Panagou

TL;DR
This paper introduces a Byzantine-resilient distributed Q-learning algorithm for multi-agent reinforcement learning, under which all agents' value functions converge almost surely to the optimal value functions despite Byzantine edge attacks.
Contribution
A novel distributed Q-learning method with a redundancy-based filtering mechanism that guarantees convergence under Byzantine edge attacks.
Findings
Under the proposed algorithm, all agents' value functions converge almost surely to the optimal value functions despite Byzantine edge attacks.
A new topological condition for convergence is introduced and can be verified efficiently.
Simulations demonstrate the method's effectiveness over prior approaches.
Abstract
We study Byzantine-resilient distributed multi-agent reinforcement learning (MARL), where agents must collaboratively learn optimal value functions over a compromised communication network. Existing resilient MARL approaches typically guarantee almost sure convergence only to near-optimal value functions, or require restrictive assumptions to ensure convergence to the optimal solution. As a result, agents may fail to learn the optimal policies under these methods. To address this, we propose a novel distributed Q-learning algorithm, under which all agents' value functions converge almost surely to the optimal value functions despite Byzantine edge attacks. The key idea is a redundancy-based filtering mechanism that leverages two-hop neighbor information to validate incoming messages, while preserving bidirectional information flow. We then introduce a new topological condition for the…
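As a rough illustration of what redundancy-based filtering with two-hop information could look like, the sketch below has an agent accept a neighbor's directly received value only when enough copies of that value, relayed through other neighbors, agree with it. This is not the authors' algorithm: the function name `filtered_average`, the `min_agree` threshold, the tolerance, and the plain averaging step are all assumptions made for illustration.

```python
from typing import Dict, List


def filtered_average(
    own: float,
    direct: Dict[str, float],
    relayed: Dict[str, List[float]],
    min_agree: int,
    tol: float = 1e-9,
) -> float:
    """Average own value with neighbor values that pass a redundancy check.

    direct[j]  : value received on the edge from neighbor j (may be tampered).
    relayed[j] : copies of j's value forwarded by other neighbors (two-hop paths).
    min_agree  : hypothetical threshold on agreeing relayed copies.
    """
    accepted = [own]
    for sender, value in direct.items():
        copies = relayed.get(sender, [])
        # Count relayed copies that match the directly received value.
        agreeing = sum(1 for c in copies if abs(c - value) <= tol)
        if agreeing >= min_agree:
            accepted.append(value)  # message validated by redundancy
        # Otherwise the message is discarded as possibly corrupted.
    return sum(accepted) / len(accepted)


# Neighbor "k"'s direct message was altered in transit; its relayed copies disagree.
result = filtered_average(
    own=1.0,
    direct={"j": 3.0, "k": 100.0},
    relayed={"j": [3.0, 3.0], "k": [5.0, 5.0]},
    min_agree=2,
)
print(result)  # 2.0
```

In this toy run, the message from "j" is confirmed by two relayed copies and kept, while the tampered message from "k" is dropped, so the agent averages only its own value with j's. How the paper chooses the threshold and combines accepted values with the Q-learning update is not stated in the summary above.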
