FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment
Kewen Zhu, Liping Yi, Zhiming Zhao, Zhuang Qi, Han Yu, Qinghua Hu

TL;DR
This paper introduces FedPDPO, a federated learning framework that personalizes large language model alignment with human preferences, effectively handling non-IID data and improving performance over existing methods.
Contribution
The paper proposes FedPDPO, a personalized federated preference optimization framework built on parameter-efficient fine-tuning. It targets the challenges of non-IID preference data and improves LLM alignment with human preferences.
Findings
Achieves up to 4.80% accuracy improvement in federated settings.
Effectively handles non-IID preference data.
Demonstrates state-of-the-art performance on multiple datasets.
Abstract
Aligning large language models (LLMs) with human preferences in federated learning (FL) is challenging due to decentralized, privacy-sensitive, and highly non-IID preference data. Direct Preference Optimization (DPO) offers an efficient alternative to reinforcement learning from human feedback (RLHF), but its direct application in FL suffers from severe performance degradation under non-IID data and limited generalization of implicit rewards. To bridge this gap, we propose FedPDPO (Federated Personalized Direct Preference Optimization), a personalized federated framework for preference alignment of LLMs. It adopts a parameter-efficient fine-tuning architecture where each client maintains a frozen pretrained LLM backbone augmented with a Low-Rank Adaptation (LoRA) adapter, enabling communication-efficient aggregation. To address non-IID heterogeneity, we devise (1) the globally shared…
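The two building blocks named in the abstract, the DPO objective and server-side aggregation of client adapters, can be illustrated with a minimal sketch. This is not the FedPDPO method itself: the abstract is truncated before its personalization components are described, so the code below shows only the standard per-pair DPO loss and a plain sample-size-weighted (FedAvg-style) average of adapter parameters. The function names `dpo_loss` and `fedavg_adapters`, the parameter name `lora_A`, the default `beta=0.1`, and the flat-list parameter layout are all assumptions made for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Inputs are sequence log-probabilities of the chosen and rejected
    responses under the trainable policy and the frozen reference model.
    beta=0.1 is an illustrative default, not a value from the paper.
    """
    # Implicit reward margin: how much more the policy (relative to the
    # reference) favors the chosen response over the rejected one.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)) computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def fedavg_adapters(client_adapters, client_sizes):
    """Sample-size-weighted average of per-client adapter parameters.

    client_adapters: list of dicts mapping a parameter name to a flat
    list of floats (hypothetical layout for this sketch).
    client_sizes: number of local preference pairs held by each client.
    """
    total = sum(client_sizes)
    first = client_adapters[0]
    return {
        name: [
            sum(size / total * adapter[name][i]
                for adapter, size in zip(client_adapters, client_sizes))
            for i in range(len(first[name]))
        ]
        for name in first
    }


if __name__ == "__main__":
    # A policy that favors the chosen response more than the reference does
    # gets a loss below log(2), the value at zero margin.
    print(dpo_loss(-10.0, -14.0, -11.0, -12.0))
    # Two clients with 1 and 3 local samples.
    adapters = [{"lora_A": [1.0, 2.0]}, {"lora_A": [3.0, 4.0]}]
    print(fedavg_adapters(adapters, [1, 3]))
```

Only the small adapter tensors are exchanged in such a scheme, which is what makes aggregation communication-efficient when the backbone stays frozen. Note that averaging the two LoRA factors separately is not equivalent to averaging their product, so this plain average is an approximation of averaging the effective weight updates.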
Taxonomy
Topics: Topic Modeling · Recommender Systems and Techniques · Advanced Graph Neural Networks
