FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment

Kewen Zhu; Liping Yi; Zhiming Zhao; Zhuang Qi; Han Yu; Qinghua Hu

arXiv:2603.19741 · cs.LG · March 23, 2026

TL;DR

This paper introduces FedPDPO, a federated learning framework that personalizes large language model alignment with human preferences, effectively handling non-IID data and improving performance over existing methods.

Contribution

FedPDPO is a personalized federated preference optimization framework built on parameter-efficient fine-tuning. It addresses non-IID data challenges and improves LLM alignment with human preferences.

Findings

01. Achieves up to 4.80% accuracy improvement in federated settings.

02. Effectively handles non-IID preference data.

03. Demonstrates state-of-the-art performance on multiple datasets.

Abstract

Aligning large language models (LLMs) with human preferences in federated learning (FL) is challenging due to decentralized, privacy-sensitive, and highly non-IID preference data. Direct Preference Optimization (DPO) offers an efficient alternative to reinforcement learning from human feedback (RLHF), but its direct application in FL suffers from severe performance degradation under non-IID data and limited generalization of implicit rewards. To bridge this gap, we propose FedPDPO (Federated Personalized Direct Preference Optimization), a personalized federated framework for preference alignment of LLMs. It adopts a parameter-efficient fine-tuning architecture where each client maintains a frozen pretrained LLM backbone augmented with a Low-Rank Adaptation (LoRA) adapter, enabling communication-efficient aggregation. To address non-IID heterogeneity, we devise (1) the globally shared…
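
As a rough illustration of two ingredients the abstract names, the sketch below computes the standard DPO loss for a single preference pair and a plain FedAvg-style weighted average of client adapter parameters. It is not FedPDPO's method: the abstract is cut off before its personalization components are described, so none of them are reproduced here. The function names (`dpo_loss`, `fedavg_adapters`), the `beta=0.1` default, and the representation of LoRA parameters as flat lists of floats are all assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is a sequence log-probability: pol_* under the trainable
    policy, ref_* under the frozen reference model. beta is a hypothetical
    default, not a value taken from the paper.
    """
    # Implicit reward of a response is beta * (log pi - log pi_ref);
    # the margin is the chosen reward minus the rejected reward.
    margin = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    # -log(sigmoid(margin)) equals softplus(-margin).
    return softplus(-margin)


def fedavg_adapters(client_adapters, client_sizes):
    """Average client adapter parameters, weighted by local sample counts.

    client_adapters: list of dicts mapping a parameter name to a flat list
    of floats (an assumed stand-in for LoRA tensors). Only these small
    adapters are exchanged; the frozen backbone never leaves the client.
    """
    total = float(sum(client_sizes))
    averaged = {}
    for name in client_adapters[0]:
        acc = [0.0] * len(client_adapters[0][name])
        for adapter, size in zip(client_adapters, client_sizes):
            weight = size / total
            for i, value in enumerate(adapter[name]):
                acc[i] += weight * value
        averaged[name] = acc
    return averaged


if __name__ == "__main__":
    # Policy equal to the reference gives a zero margin, so the loss is log(2).
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    clients = [{"lora_A": [1.0, 2.0]}, {"lora_A": [3.0, 4.0]}]
    print(fedavg_adapters(clients, client_sizes=[1, 3]))
```

One simplification is worth noting: averaging the two LoRA factors separately is not, in general, the same as averaging their products, so this should be read as the simplest baseline aggregation rather than a recommended scheme.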

Taxonomy

Topics: Topic Modeling · Recommender Systems and Techniques · Advanced Graph Neural Networks