FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation

Wenzheng Jiang; Ji Wang; Xiongtao Zhang; Weidong Bao; Cheston Tan; Flint Xiaofeng Fan

arXiv:2502.00870·cs.LG·February 4, 2025

TL;DR

FedHPD introduces a novel federated reinforcement learning framework that enables heterogeneous agents to share knowledge effectively through policy distillation, overcoming limitations of existing methods and enhancing performance in diverse tasks.

Contribution

This paper proposes FedHPD, a new method for heterogeneous federated RL using action probability distillation, with theoretical convergence analysis and practical effectiveness demonstrated.

Findings

1. FedHPD outperforms existing methods on multiple RL benchmarks.

2. It effectively enables knowledge sharing among heterogeneous agents.

3. The method operates well without requiring extensive public datasets.

Abstract

Federated Reinforcement Learning (FedRL) improves sample efficiency while preserving privacy; however, most existing studies assume homogeneous agents, limiting its applicability in real-world scenarios. This paper investigates FedRL in black-box settings with heterogeneous agents, where each agent employs distinct policy networks and training configurations without disclosing their internal details. Knowledge Distillation (KD) is a promising method for facilitating knowledge sharing among heterogeneous models, but it faces challenges related to the scarcity of public datasets and limitations in knowledge representation when applied to FedRL. To address these challenges, we propose Federated Heterogeneous Policy Distillation (FedHPD), which solves the problem of heterogeneous FedRL by utilizing action probability distributions as a medium for knowledge sharing. We provide a theoretical…
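To make the idea of sharing action probability distributions concrete, here is a minimal sketch in pure Python. It is not taken from the paper: the abstract does not specify how distributions are aggregated or which loss is used, so the averaging rule, the KL-divergence loss, the shared set of states, and all function names (`average_distributions`, `distillation_loss`, and so on) are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Turn raw policy logits into an action probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_distributions(per_agent_probs):
    """Average action distributions across agents, state by state.

    per_agent_probs[k][s] is agent k's distribution on shared state s.
    Plain averaging is an assumed aggregation rule, not the paper's.
    """
    n_agents = len(per_agent_probs)
    n_states = len(per_agent_probs[0])
    consensus = []
    for s in range(n_states):
        n_actions = len(per_agent_probs[0][s])
        consensus.append([
            sum(per_agent_probs[k][s][a] for k in range(n_agents)) / n_agents
            for a in range(n_actions)
        ])
    return consensus


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(consensus, agent_probs):
    """Mean KL between the consensus and one agent's distributions (assumed loss)."""
    return sum(kl_divergence(c, a) for c, a in zip(consensus, agent_probs)) / len(consensus)


# Two hypothetical agents with different internals, evaluated on the same two states.
# Only their output distributions are exchanged, never weights or architectures.
agent_a = [softmax([2.0, 0.5, 0.1]), softmax([0.1, 0.2, 3.0])]
agent_b = [softmax([1.0, 1.0, 0.0]), softmax([0.0, 0.5, 2.0])]

consensus = average_distributions([agent_a, agent_b])
loss_a = distillation_loss(consensus, agent_a)
loss_b = distillation_loss(consensus, agent_b)
```

In this sketch each agent would minimize its own distillation loss alongside its usual RL objective, which is one plausible way to share knowledge in a black-box setting because only probability vectors cross the agent boundary.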

Code & Models

Repositories

winzhengkong/fedhpd (PyTorch, official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Traffic Prediction and Management Techniques · Age of Information Optimization

Methods: Knowledge Distillation