Meet Dynamic Individual Preferences: Resolving Conflicting Human Value with Paired Fine-Tuning
Shanyong Wang, Shuhang Lin, Yining Zhao, Xi Zhu, Yongfeng Zhang

TL;DR
This paper introduces Preference-Paired Fine-Tuning (PFT), a novel method to align large language models with dynamic, conflicting individual preferences, outperforming existing methods in accuracy and adaptation speed.
Contribution
The paper proposes PFT, a new fine-tuning framework for adapting LLMs to conflicting and evolving personal preferences, supported by a new dataset and comprehensive experiments.
Findings
PFT achieves up to 96.6% accuracy in multi-choice classification.
PFT attains the highest open-ended generation score of 8.69.
Even with limited user data, models improve preference alignment by 44.76%.
Abstract
Recent advances in large language models (LLMs) have significantly improved the alignment of models with general human preferences. However, a major challenge remains in adapting LLMs to individual preferences, which are not only diverse but also dynamic. In this paper, we introduce a novel framework, Preference-Paired Fine-Tuning (PFT), designed to align models with contradictory and evolving individual preferences. We present a new dataset, Value Conflict Dilemma (VCD), which includes scenarios that involve conflicting human preferences, facilitating the evaluation of our approach. Our experiments demonstrate that PFT outperforms single-preference training methods, achieving up to 96.6% accuracy in multi-choice classification tasks and the highest open-ended generation score of 8.69. PFT also shows significant improvements over DPO, SFT and some traditional training methods,…
