Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence
Philip Jordan, Florian Grötschla, Flint Xiaofeng Fan, Roger Wattenhofer

TL;DR
This paper introduces the first decentralized Byzantine fault-tolerant federated reinforcement learning method, combining robust aggregation and agreement techniques, with proven convergence and empirical validation demonstrating improved speed and resilience.
Contribution
It presents a novel decentralized Byzantine fault-tolerant policy gradient algorithm with theoretical analysis and experimental validation, eliminating the need for a trusted central aggregator.
Findings
Achieves Byzantine fault-tolerance in decentralized federated RL
Demonstrates faster convergence with more agents
Shows robustness against various Byzantine attacks
Abstract
In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide the first decentralized Byzantine fault-tolerant FRL method. Towards this end, we first propose a new centralized Byzantine fault-tolerant policy gradient (PG) algorithm that improves over existing methods by relying only on assumptions standard for non-fault-tolerant PG. Then, as our main contribution, we show how a combination of robust aggregation and Byzantine-resilient agreement methods can be leveraged in order to eliminate the need for a trusted central entity. Since our results…
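The abstract mentions robust aggregation but the excerpt does not say which aggregation rule the paper uses. As an illustration only, the sketch below uses the coordinate-wise median, a commonly used robust aggregator, and compares it with a plain mean when one agent reports an arbitrary gradient. The function names (`coordinate_wise_median`, `plain_mean`) and all numbers are hypothetical and not taken from the paper.

```python
from statistics import median


def coordinate_wise_median(gradients):
    """Aggregate equal-length gradient vectors by taking the median per coordinate.

    Illustrative robust aggregator; an assumption, not the paper's stated rule.
    """
    if not gradients:
        raise ValueError("need at least one gradient")
    dim = len(gradients[0])
    if any(len(g) != dim for g in gradients):
        raise ValueError("all gradients must have the same dimension")
    return [median(g[i] for g in gradients) for i in range(dim)]


def plain_mean(gradients):
    """Non-robust baseline: average each coordinate over all reports."""
    n = len(gradients)
    return [sum(column) / n for column in zip(*gradients)]


# Hypothetical reports: four honest agents with similar policy-gradient
# estimates, and one Byzantine agent sending an arbitrary vector.
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1], [1.0, -2.0]]
byzantine = [[1000.0, 1000.0]]
reports = honest + byzantine

robust = coordinate_wise_median(reports)  # stays close to the honest estimates
naive = plain_mean(reports)               # dragged far away by the outlier
```

With a minority of faulty reports, the per-coordinate median stays within the range of the honest values, whereas a single outlier can move the mean arbitrarily far. In a decentralized setting each agent would also need an agreement step so that honest agents end up with consistent aggregates; that part is not sketched here.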
Taxonomy
Topics: Privacy-Preserving Technologies in Data
