Approximated Behavioral Metric-based State Projection for Federated Reinforcement Learning

Zengxia Guo; Bohui An; Zhongqi Lu

arXiv:2505.09959·cs.LG·May 16, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces FedRAG, a federated reinforcement learning framework that uses approximated behavior metric-based state projection functions to improve performance while preserving privacy, validated through experiments on DeepMind Control Suite.

Contribution

The paper proposes a novel state projection method in federated RL that enhances performance and privacy, with a practical learning framework called FedRAG.

Findings

01. Improved performance in federated RL tasks.

02. Effective privacy preservation of sensitive information.

03. Validated results on DeepMind Control Suite.

Abstract

Federated reinforcement learning (FRL) methods usually share the encrypted local state or policy information and help each client to learn from others while preserving everyone's privacy. In this work, we propose that sharing the approximated behavior metric-based state projection function is a promising way to enhance the performance of FRL and concurrently provide effective protection of sensitive information. We introduce FedRAG, an FRL framework that learns a computationally practical projection function of states for each client and aggregates the parameters of the projection functions at a central server. The FedRAG approach shares no sensitive task-specific information, yet provides information gain for each client. We conduct extensive experiments on the DeepMind Control Suite to demonstrate insightful results.
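The server-side step described in the abstract, aggregating the parameters of each client's projection function, can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, so the weighted averaging below (in the style of FedAvg) is an assumption, as are the function name `aggregate_projection_params` and the flat `name -> list of floats` parameter layout. It is not the authors' implementation.

```python
from typing import Dict, List, Optional

# Hypothetical layout: each client sends its projection-function parameters
# as a mapping from parameter name to a flat list of floats.
Params = Dict[str, List[float]]


def aggregate_projection_params(
    client_params: List[Params],
    weights: Optional[List[float]] = None,
) -> Params:
    """Weighted average of client projection parameters (assumed rule)."""
    if not client_params:
        raise ValueError("need at least one client")
    if weights is None:
        # Default assumption: every client contributes equally.
        weights = [1.0] * len(client_params)
    if len(weights) != len(client_params):
        raise ValueError("one weight per client is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    aggregated: Params = {}
    for name, reference in client_params[0].items():
        acc = [0.0] * len(reference)
        for params, weight in zip(client_params, weights):
            if len(params[name]) != len(reference):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            for i, value in enumerate(params[name]):
                acc[i] += (weight / total) * value
        aggregated[name] = acc
    return aggregated


if __name__ == "__main__":
    # Two toy clients; only projection parameters are shared, not states,
    # actions, or rewards.
    clients = [
        {"w": [1.0, 2.0], "b": [0.0]},
        {"w": [3.0, 4.0], "b": [2.0]},
    ]
    print(aggregate_projection_params(clients))
```

In a full round, the server would send the aggregated parameters back to every client, and each client would continue local training from them. That part is omitted here because the abstract gives no details on it.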

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 2

Strengths

- Proposes a new algorithm that enhances the performance of FRL.
- The experiments show the effectiveness of the algorithm.

Weaknesses

- It would be better to have a problem formulation section that states the assumptions used in the algorithm.
- Section 3 could be improved; for example, a few more sentences could explain which global critic Q network is used in the algorithm.
- The presentation of the experiments could be better; for example, for each experiment in Section 5.4, explain why the proposed algorithm is better than, similar to, or worse than the baselines. The plots could also be a bit bigger.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

S1. The proposed FedRAG technique is interesting, appears sound and (mostly) attains favorable results over the baselines.
S2. The paper is generally well-written and organised.

Weaknesses

W1. The claims of the technique providing "effective protection of sensitive information" are tenuous and should be proven or otherwise removed. Even though the method does not share e.g. the raw states, in my opinion it is possible to devise a fairly straightforward attack to recover them. An attacker can use the exposed mapping function to find the representation of a given set of states, similar to the application of a hash function. If the state space is sufficiently small (or reasonable gue
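The attack outlined in W1 can be sketched as a dictionary-style search: an attacker who holds the shared projection function evaluates it on candidate states and keeps the one whose representation is closest to an observed representation. Everything below is a toy illustration under stated assumptions. The linear `toy_projection`, the integer grid of candidates, and the name `recover_state` are hypothetical, and nothing here reproduces FedRAG's actual projection network or a demonstrated attack on it.

```python
import itertools
import math
from typing import Callable, Iterable, Optional, Sequence, Tuple

State = Tuple[float, ...]


def recover_state(
    projection: Callable[[State], Sequence[float]],
    observed: Sequence[float],
    candidates: Iterable[State],
) -> Tuple[Optional[State], float]:
    """Return the candidate whose projection is nearest to `observed`."""
    best_state: Optional[State] = None
    best_dist = math.inf
    for state in candidates:
        # Euclidean distance between the candidate's representation and
        # the representation the attacker observed.
        dist = math.dist(projection(state), observed)
        if dist < best_dist:
            best_state, best_dist = state, dist
    return best_state, best_dist


def toy_projection(state: State) -> Tuple[float, float]:
    # Hypothetical stand-in for a shared projection function.
    x, y = state
    return (x + 2.0 * y, x - y)


def candidate_grid(size: int) -> Iterable[State]:
    # Assumption: the state space is small enough to enumerate.
    values = [float(v) for v in range(size)]
    return itertools.product(values, repeat=2)


if __name__ == "__main__":
    secret = (3.0, 1.0)
    leaked = toy_projection(secret)
    print(recover_state(toy_projection, leaked, candidate_grid(5)))
```

The search cost grows with the number of candidates, which matches the reviewer's condition that the state space be sufficiently small. Whether such a search succeeds against a learned, non-injective projection is not established by this sketch.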

Reviewer 03 · Rating 3 · Confidence 4

Strengths

+ The approach of sharing behavioral metric-based state projections in a federated reinforcement learning (FRL) setting is an innovative idea.
+ Addressing privacy in FRL through state projections based on behavioral metrics could make a significant contribution, given the growing concerns around data privacy in federated setups.

Weaknesses

After a close review, I am inclined to recommend rejection due to key issues in theoretical rigor, completeness of related work, and empirical validation.
1. Novelty and Related Work: The paper overlooks critical related work in FedRL, especially recent developments that are highly relevant to FedRAG’s contributions. Notably, Fan et al. [1] formulates the objectives of FedRL under homogeneous environments. How does FedRAG’s objective differ from that? Are you working for the same objective bu

Taxonomy

Topics: Transportation and Mobility Innovations · Traffic control and management · Reinforcement Learning in Robotics