Federated Control in Markov Decision Processes
Hao Jin, Yang Peng, Liangyu Zhang, Zhihua Zhang

TL;DR
This paper introduces a federated learning approach for Markov Decision Processes, enabling multiple agents with limited capabilities to collaboratively learn optimal policies without sharing local experiences, and provides theoretical and experimental validation.
Contribution
It proposes the FedQ protocol for federated control in MDPs, addressing heterogeneity and communication challenges with theoretical guarantees and sample complexity analysis.
Findings
FedQ achieves linear speedup in sample complexity with uniform workload distribution.
Theoretical analysis confirms correctness and efficiency of FedQ and its variants.
Experiments demonstrate the practical effectiveness of the proposed methods.
Abstract
We study problems of federated control in Markov Decision Processes. To solve an MDP with a large state space, multiple learning agents are introduced to collaboratively learn its optimal policy without communicating locally collected experience. In our setting, these agents have limited capabilities, which means they are restricted to different regions of the overall state space during the training process. To handle the differences among restricted regions, we first introduce the concept of leakage probabilities to understand how such heterogeneity affects the learning process, and then propose a novel communication protocol that we call the Federated-Q protocol (FedQ), which periodically aggregates agents' knowledge of their restricted regions and accordingly modifies their learning problems for further training. In terms of theoretical analysis, we justify the correctness of FedQ…
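The periodic aggregation described in the abstract can be illustrated with a small tabular sketch. This is not the paper's FedQ protocol, whose details are not given in the truncated abstract; it is a simplified illustration under stated assumptions: a deterministic six-state chain MDP, two agents that each sample only from their own region, and a server that copies each agent's Q-values for its region into a shared table once per round. The names `step`, `federated_q`, `regions`, `rounds`, and `local_steps` are hypothetical, and leakage probabilities are not modeled.

```python
import random


def step(s, a, n_states):
    """Toy deterministic chain (assumption): action 1 moves right, action 0 moves left."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward


def federated_q(n_states=6, regions=((0, 1, 2), (3, 4, 5)), rounds=200,
                local_steps=50, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    # Shared Q-table: one row per state, one column per action.
    global_q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(rounds):
        local_tables = []
        for region in regions:
            # Each agent starts the round from the last aggregated table.
            q = [row[:] for row in global_q]
            for _ in range(local_steps):
                # The agent only ever samples states from its own region.
                s = rng.choice(region)
                a = rng.randrange(2)
                s_next, r = step(s, a, n_states)
                # If s_next lies outside the region, q[s_next] still holds the
                # value from the last aggregation, since it is never updated locally.
                target = r + gamma * max(q[s_next])
                q[s][a] += alpha * (target - q[s][a])
            local_tables.append(q)
        # Aggregation: only Q-values are exchanged, never the raw transitions.
        for region, q in zip(regions, local_tables):
            for s in region:
                global_q[s] = q[s][:]
    return global_q


if __name__ == "__main__":
    for state, row in enumerate(federated_q()):
        print(state, [round(v, 3) for v in row])
```

In this toy setup each agent relies on the other agent's region only through the aggregated table, which is the sense in which knowledge is shared without sharing experience. The regions here are disjoint and equally sized; overlapping or unbalanced regions would need a different aggregation rule.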
Taxonomy
Topics: Distributed systems and fault tolerance · Data Quality and Management · Advanced Research in Systems and Signal Processing
