CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening
Hei Yi Mak, Flint Xiaofeng Fan, Luca A. Lanzendörfer, Cheston Tan, Wei Tsang Ooi, Roger Wattenhofer

TL;DR
This paper introduces CAESAR, a novel aggregation method for federated reinforcement learning that improves learning efficiency in heterogeneous environments by selectively combining agents based on convergence status.
Contribution
We propose CAESAR, a convergence-aware sampling and screening strategy that enhances federated RL performance across diverse MDPs by leveraging agents' convergence properties.
Findings
CAESAR outperforms traditional averaging in heterogeneous MDPs.
Empirical results show improved learning speed and efficiency.
CAESAR is effective in environments such as GridWorld and FrozenLake-v1.
Abstract
In this study, we delve into Federated Reinforcement Learning (FedRL) in the context of value-based agents operating across diverse Markov Decision Processes (MDPs). Existing FedRL methods typically aggregate agents' learning by averaging the value functions across them to improve their performance. However, this aggregation strategy is suboptimal in heterogeneous environments where agents converge to diverse optimal value functions. To address this problem, we introduce the Convergence-AwarE SAmpling with scReening (CAESAR) aggregation scheme designed to enhance the learning of individual agents across varied MDPs. CAESAR is an aggregation strategy used by the server that combines convergence-aware sampling with a screening mechanism. By exploiting the fact that agents learning in identical MDPs are converging to the same optimal value function, CAESAR enables the selective…
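To make the abstract's point about averaging concrete, here is a minimal sketch of why a plain average of value functions can hurt when agents live in different MDPs. It does not reproduce CAESAR: the abstract above is truncated, and its sampling and screening rules are not shown here. The Q-tables, the MDP labels, and the helper names (`average_q_tables`, `average_within_groups`, `greedy_policy`) are all hypothetical, and the label-based grouping is an oracle that a real server would not be handed.

```python
import numpy as np


def average_q_tables(q_tables):
    """Element-wise mean of a list of Q-tables with identical shape."""
    return np.mean(np.stack(q_tables), axis=0)


def average_within_groups(q_tables, labels):
    """Oracle baseline (assumption): average only agents sharing an MDP label."""
    groups = {}
    for q, label in zip(q_tables, labels):
        groups.setdefault(label, []).append(q)
    return {label: average_q_tables(qs) for label, qs in groups.items()}


def greedy_policy(q_table):
    """Greedy action per state (argmax over the action axis)."""
    return np.argmax(q_table, axis=1)


# Hypothetical converged Q-tables: 2 states x 2 actions.
q_mdp_a = np.array([[1.0, 0.0], [1.0, 0.0]])  # in MDP "A", action 0 is best
q_mdp_b = np.array([[0.0, 1.0], [0.0, 1.0]])  # in MDP "B", action 1 is best

agents = [q_mdp_a, q_mdp_a, q_mdp_b, q_mdp_b]
labels = ["A", "A", "B", "B"]  # oracle knowledge, for illustration only

# Averaging across all agents erases the preference between actions.
global_avg = average_q_tables(agents)

# Averaging only among agents in the same MDP preserves each optimum.
per_group = average_within_groups(agents, labels)
```

In this toy setup every entry of `global_avg` is 0.5, so the averaged table no longer distinguishes the two actions, while `per_group["A"]` and `per_group["B"]` still match the tables their agents converged to. This is the behaviour the abstract motivates: agents in identical MDPs share an optimal value function, so combining them selectively is safer than averaging everyone.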
Taxonomy
Topics: Network Packet Processing and Optimization · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
