Orthogonal Finetuning for Direct Preference Optimization
Chenxu Yang, Ruipeng Jia, Naibin Gu, Zheng Lin, Siyuan Chen, Chao Pang, Weichong Yin, Yu Sun, Hua Wu, Weiping Wang

TL;DR
This paper introduces orthogonal finetuning for Direct Preference Optimization (DPO) via weight rotation, reducing overfitting and improving alignment with human preferences while preserving generation diversity and adding very few trainable parameters.
Contribution
The paper proposes an orthogonal finetuning method that keeps hyperspherical energy invariant during preference optimization, improving alignment and diversity with minimal parameter updates.
Findings
Outperforms DPO on MT-Bench and AlpacaEval 2 benchmarks.
Reduces overfitting and maintains model diversity.
Uses only 0.0086% of the trainable parameters.
Abstract
Direct Preference Optimization (DPO) is an effective preference optimization algorithm. However, DPO-tuned models tend to overfit on the dispreferred samples, which manifests as overly long generations that lack diversity. Recent regularization approaches have tried to alleviate this issue by modifying the objective function, but they do so at the cost of degraded alignment performance. In this paper, we instead incorporate regularization from the perspective of weight updating to curb alignment overfitting. In a pilot experiment, we found a positive correlation between overfitting and hyperspherical energy fluctuation. Hence, we introduce orthogonal finetuning for DPO via a weight-Rotated Preference Optimization (RoPO) method, which conducts only rotational and magnitude-stretching updates on the weight parameters to keep the hyperspherical energy invariant,…
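The invariance claim in the abstract can be checked numerically. The sketch below is not the authors' implementation: it assumes the definition of hyperspherical energy commonly used in the orthogonal finetuning literature (the sum of inverse pairwise Euclidean distances between unit-normalized neurons), and the names `hyperspherical_energy` and `random_rotation` are hypothetical helpers introduced for illustration. Under that assumption, rotating every neuron by one shared orthogonal matrix and stretching each neuron's magnitude leaves the energy unchanged, whereas an ordinary additive update generally does not.

```python
import numpy as np

def hyperspherical_energy(W, eps=1e-12):
    """Sum of inverse pairwise distances between unit-normalized rows of W.

    Assumed definition (not taken from the summary above): each row of W
    is treated as one neuron and projected onto the unit hypersphere.
    """
    U = W / np.linalg.norm(W, axis=1, keepdims=True)
    diff = U[:, None, :] - U[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(W.shape[0], k=1)  # each pair counted once
    return float(np.sum(1.0 / (dist[upper] + eps)))

def random_rotation(d, rng):
    """Random rotation matrix (orthogonal, determinant +1) via QR."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]  # flip one column to turn a reflection into a rotation
    return Q

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))             # 8 neurons of dimension 16
R = random_rotation(16, rng)                 # shared rotation for all neurons
scales = rng.uniform(0.5, 2.0, size=(8, 1))  # positive per-neuron stretch

W_rotated = scales * (W @ R.T)                        # rotate, then stretch
W_additive = W + 0.5 * rng.standard_normal(W.shape)   # unconstrained update

he_before = hyperspherical_energy(W)
he_rotated = hyperspherical_energy(W_rotated)
he_additive = hyperspherical_energy(W_additive)

print(f"original: {he_before:.6f}")
print(f"rotated and stretched: {he_rotated:.6f}")
print(f"additive update: {he_additive:.6f}")
```

The rotation preserves all pairwise angles between neurons, and positive scaling disappears after normalization, which is why the first two printed values agree up to floating-point error while the third differs.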
Taxonomy
Topics: Color perception and design · Advanced Multi-Objective Optimization Algorithms · Optimization and Packing Problems
Methods: Direct Preference Optimization
