FedGRPO: Privately Optimizing Foundation Models with Group-Relative Rewards from Domain Client

Gongxi Zhu; Hanlin Gu; Lixin Fan; Qiang Yang; Yuxing Han

arXiv:2602.12014·cs.LG·February 13, 2026

TL;DR

FedGRPO introduces a privacy-preserving federated learning framework that uses group-relative rewards and expert selection to improve foundation model performance efficiently across diverse domains.

Contribution

The paper reformulates federated foundation model training as a reinforcement-learning-style evaluation process, reducing privacy risks and communication costs while improving accuracy.

Findings

01. Outperforms baseline methods in accuracy across multiple domain tasks.

02. Reduces communication overhead compared to traditional FedFMs.

03. Maintains privacy by exchanging reward signals instead of data or model updates.

Abstract

One important direction of Federated Foundation Models (FedFMs) is leveraging data from small client models to enhance the performance of a large server-side foundation model. Existing methods based on model-level or representation-level knowledge transfer either require expensive local training or incur high communication costs and introduce unavoidable privacy risks. We reformulate this problem as a reinforcement-learning-style evaluation process and propose FedGRPO, a privacy-preserving framework comprising two modules. The first module performs competence-based expert selection by building a lightweight confidence graph from auxiliary data to identify the most suitable clients for each question. The second module leverages the "Group Relative" concept from the Group Relative Policy Optimization (GRPO) framework by packaging each question together with its solution rationale into…
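
The abstract is cut off before it explains how client rewards are collected and aggregated, so the following is only a minimal sketch of the generic "group relative" idea from GRPO, not FedGRPO's actual procedure. It assumes a group of candidate answers to one question has already received scalar rewards, and it normalizes each reward against the group's own mean and standard deviation. The function name `group_relative_advantages`, the example reward values, the `eps` constant, and the use of the population standard deviation are all illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to its own group (hypothetical helper).

    Each advantage is (reward - group mean) / (group std + eps), so a
    candidate is judged against its siblings rather than on an absolute scale.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four candidate answers to the same question.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Answers above the group mean get positive advantages, those below get negative.
for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f} advantage={a:+.3f}")
```

Because the normalization is relative, a group in which every candidate earns the same reward yields all-zero advantages and therefore carries no learning signal for that question.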

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Recommender Systems and Techniques · Domain Adaptation and Few-Shot Learning