On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations

Guojun Xiong; Shufan Wang; Daniel Jiang; Jian Li

arXiv:2411.15014·cs.LG·July 18, 2025


Open Access · 3 Reviews

TL;DR

This paper introduces a personalized federated reinforcement learning framework that leverages shared representations among heterogeneous agents, proving linear convergence speedup and demonstrating improved learning and generalization in experiments.

Contribution

The paper proposes PFedRL, a novel personalized federated RL framework with shared representations, and proves the first linear convergence speedup with respect to the number of agents.

Findings

01

PFedTD-Rep achieves linear convergence speedup with more agents.

02

Experimental results show improved learning in heterogeneous environments.

03

PFedTD-Rep generalizes better to new environments.

Abstract

Federated reinforcement learning (FedRL) enables multiple agents to collaboratively learn a policy without sharing their local trajectories collected during agent-environment interactions. However, in practice, the environments faced by different agents are often heterogeneous, leading to poor performance by the single policy learned by existing FedRL algorithms on individual agents. In this paper, we take a further step and introduce a personalized FedRL framework (PFedRL) by taking advantage of possibly shared common structure among agents in heterogeneous environments. Specifically, we develop a class of PFedRL algorithms named PFedRL-Rep that learns (1) a shared feature representation collaboratively among all agents, and (2) an agent-specific weight vector personalized to its local environment. We analyze the convergence of PFedTD-Rep, a particular instance of the framework…
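The split between a shared representation and agent-specific weights can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a tabular feature matrix `Phi`, linear value estimates `Phi[s] @ theta_i`, plain TD(0) updates, server-side averaging of `Phi` every round, and two hand-picked step sizes (a larger one for the weights, a smaller one for the representation). The toy environments, function names, and all hyperparameters are hypothetical.

```python
import numpy as np

def make_agent_env(n_states, rng):
    """Hypothetical toy environment: a random Markov reward process."""
    P = rng.random((n_states, n_states))
    P /= P.sum(axis=1, keepdims=True)  # make each row a probability distribution
    r = rng.random(n_states)           # per-state reward in [0, 1)
    return P, r

def pfed_td_rep_sketch(n_agents=4, n_states=6, k=3, rounds=200,
                       gamma=0.9, alpha_theta=0.1, alpha_phi=0.01, seed=0):
    """Illustrative federated TD(0) with a shared representation (assumed form)."""
    rng = np.random.default_rng(seed)
    # Heterogeneous environments: each agent gets its own dynamics and rewards.
    envs = [make_agent_env(n_states, rng) for _ in range(n_agents)]
    Phi = rng.normal(scale=0.1, size=(n_states, k))  # shared feature representation
    thetas = [np.zeros(k) for _ in range(n_agents)]  # agent-specific weight vectors
    states = [0] * n_agents

    for _ in range(rounds):
        local_phis = []
        for i, (P, r) in enumerate(envs):
            s = states[i]
            s_next = rng.choice(n_states, p=P[s])  # one Markovian transition
            Phi_i = Phi.copy()                     # local copy of the shared part
            # TD(0) error under the linear estimate V_i(s) = Phi[s] @ theta_i
            delta = r[s] + gamma * Phi_i[s_next] @ thetas[i] - Phi_i[s] @ thetas[i]
            feat, w = Phi_i[s].copy(), thetas[i].copy()
            thetas[i] = w + alpha_theta * delta * feat   # personalized update, stays local
            Phi_i[s] = feat + alpha_phi * delta * w      # slower representation update
            local_phis.append(Phi_i)
            states[i] = s_next
        # Server step: only the representation is averaged; trajectories
        # and the personalized weights never leave the agents.
        Phi = np.mean(local_phis, axis=0)
    return Phi, thetas

if __name__ == "__main__":
    Phi, thetas = pfed_td_rep_sketch()
    print(Phi.shape, [t.round(3) for t in thetas])
```

In this sketch each agent ends with its own `theta_i` while all agents agree on `Phi`, which is the structural idea the abstract describes; the paper's actual update rules, step-size schedule, and communication pattern may differ.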

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 2

Strengths

- The theoretical analysis has a proof sketch, and the assumptions are clearly stated.
- I think the two-timescale approximation result is novel.
- Interesting results from cliff-walking and cartpole.

Weaknesses

- Although the setup holds promise, the evaluation is rather on the simple side. I consider it understandable for now since the main focus is on theory.
- If I understand correctly, the linear speedup is not a particularly exciting result, since sample collection in unit time grows linearly with N.
- There is no comparison with other PFL methods or parameter-sharing MARL methods.

Reviewer 02 · Rating 8 · Confidence 3

Strengths

1. PFEDRL-REP is an innovative approach to FedRL, addressing a major challenge in heterogeneous environments by introducing a shared representation while allowing for agent-level personalization.
2. The paper provides a rigorous theoretical foundation, including proofs of convergence speedup under Markovian noise using a two-timescale stochastic approximation framework.
3. The paper is well-structured and clear, with detailed explanations of the problem formulation, the PFEDRL-REP framework, an

Weaknesses

1. The experimental evaluation could be extended to include more complex environments, such as those with sparse rewards or high-dimensional state spaces, to better assess the scalability of PFEDRL-REP.
2. The applicability of PFEDRL-REP to all types of environmental heterogeneity is not fully guaranteed, as the combination of shared feature representations and personalized weight vectors may not capture all nuances of diverse environments.

Reviewer 03 · Rating 6 · Confidence 4

Strengths

+ The paper is practically relevant and addresses real-world heterogeneity in federated RL environments.
+ The manuscript made some innovations in FedRL theories:
  - First work to prove linear speedup in personalized federated RL with shared representations under Markovian noise
  - Rigorous analysis of convergence rates using two-timescale stochastic approximation theory

Weaknesses

While there are merits in the paper’s theoretical contributions, I am concerned with a few critical points:
1. Regarding the motivation on personalization:
   a. The paper lacks a formal definition of what constitutes successful personalization. The authors should consider designing metrics to quantify personalization quality.
   b. Thus, there are no theoretical guarantees that learned personalization (via agent-specific parameters in the paper) captures meaningful environment-specific adaptations. What happen


Taxonomy

Topics: Privacy-Preserving Technologies in Data · Transportation and Mobility Innovations