FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF

Flint Xiaofeng Fan, Cheston Tan, Yew-Soon Ong, Roger Wattenhofer, Wei-Tsang Ooi

arXiv:2412.15538 · cs.LG · February 11, 2025

Open Access · 1 Repo

TL;DR

FedRLHF is a federated framework for reinforcement learning with human feedback (RLHF). Raw data and human feedback stay on each client, policies are personalized per client, and the method comes with convergence guarantees while performing comparably to centralized RLHF.

Contribution

It presents the first federated RLHF framework with proven convergence and privacy preservation, enabling collaborative, personalized policy learning without sharing raw data.

Findings

01. Achieves privacy-preserving RLHF with performance comparable to centralized methods.

02. Provides theoretical convergence guarantees and sample complexity bounds.

03. Demonstrates effective personalization and scalability on real datasets.

Abstract

In the era of increasing privacy concerns and demand for personalized experiences, traditional Reinforcement Learning with Human Feedback (RLHF) frameworks face significant challenges due to their reliance on centralized data. We introduce Federated Reinforcement Learning with Human Feedback (FedRLHF), a novel framework that decentralizes the RLHF process. FedRLHF enables collaborative policy learning across multiple clients without necessitating the sharing of raw data or human feedback, thereby ensuring robust privacy preservation. Leveraging federated reinforcement learning, each client integrates human feedback locally into their reward functions and updates their policies through personalized RLHF processes. We establish rigorous theoretical foundations for FedRLHF, providing convergence guarantees, and deriving sample complexity bounds that scale efficiently with the number of…
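
The loop described in the abstract (local feedback folded into a local reward, local policy updates, and server-side aggregation of parameters only) can be illustrated with a toy sketch. This is not the paper's algorithm: a simple quadratic objective stands in for the RL objective, plain parameter averaging stands in for the aggregation rule, and every name here (`local_update`, `federated_average`, `run_rounds`, `lam`, `private_pref`) is a hypothetical chosen for illustration.

```python
from typing import List

Vector = List[float]


def local_update(theta: Vector, task_target: Vector, private_pref: Vector,
                 lam: float = 0.5, lr: float = 0.1, steps: int = 5) -> Vector:
    """Run a few gradient-ascent steps on one client's shaped objective.

    Assumed toy objective (not from the paper):
        J_k(theta) = -||theta - task_target||^2 - lam * ||theta - private_pref||^2
    `private_pref` stands in for locally collected human feedback and is
    only ever read inside this function, i.e. on the client.
    """
    theta = list(theta)  # copy so the global parameters are not mutated
    for _ in range(steps):
        for i in range(len(theta)):
            grad = (-2.0 * (theta[i] - task_target[i])
                    - 2.0 * lam * (theta[i] - private_pref[i]))
            theta[i] += lr * grad
    return theta


def federated_average(client_thetas: List[Vector]) -> Vector:
    """Server step: average the parameter vectors sent by the clients."""
    n = len(client_thetas)
    return [sum(values) / n for values in zip(*client_thetas)]


def run_rounds(task_target: Vector, private_prefs: List[Vector],
               rounds: int = 20, **local_kwargs) -> Vector:
    """Alternate local updates and server averaging for a fixed number of rounds."""
    theta = [0.0] * len(task_target)
    for _ in range(rounds):
        # Each client starts from the current global parameters.
        updates = [local_update(theta, task_target, pref, **local_kwargs)
                   for pref in private_prefs]
        # Only parameters reach the server, never the private preferences.
        theta = federated_average(updates)
    return theta


if __name__ == "__main__":
    global_theta = run_rounds([1.0], [[0.0], [2.0]])
    personalized = local_update(global_theta, [1.0], [0.0], steps=50)
    print("global:", global_theta, "client 0 personalized:", personalized)
```

In this toy setting the global parameters settle between the two clients' preferences, and a client can recover a personalized solution by continuing local updates from the global parameters.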

Code & Models

Repositories

flint-xf-fan/Byzantine-Federeated-RL
pytorch

Taxonomy

Topics: Cryptography and Data Security · Privacy-Preserving Technologies in Data · Access Control and Trust