Orthogonal Finetuning for Direct Preference Optimization

Chenxu Yang; Ruipeng Jia; Naibin Gu; Zheng Lin; Siyuan Chen; Chao Pang; Weichong Yin; Yu Sun; Hua Wu; Weiping Wang

arXiv:2409.14836 · cs.CL · August 26, 2025


Open Access

TL;DR

This paper introduces orthogonal finetuning via weight rotation for DPO, effectively reducing overfitting and improving alignment with human preferences while maintaining model diversity and using minimal additional parameters.

Contribution

It proposes a novel orthogonal finetuning method that preserves hyperspherical energy during preference optimization, enhancing alignment and diversity with minimal parameter updates.

Findings

01. Outperforms DPO on MT-Bench and AlpacaEval 2 benchmarks.

02. Reduces overfitting and maintains model diversity.

03. Uses only 0.0086% of trainable parameters.

Abstract

Direct Preference Optimization (DPO) is an effective preference optimization algorithm. However, DPO-tuned models tend to overfit on the dispreferred samples, which manifests as overly long generations that lack diversity. Recent regularization approaches alleviate this issue by modifying the objective function, but at the cost of degraded alignment performance. In this paper, we instead introduce regularization from the perspective of weight updating to curb alignment overfitting. In a pilot experiment, we found a positive correlation between overfitting and hyperspherical energy fluctuation. Hence, we introduce orthogonal finetuning for DPO via a weight-Rotated Preference Optimization (RoPO) method, which applies only rotational and magnitude-stretching updates to the weight parameters so that the hyperspherical energy stays invariant,…
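The invariance claim in the abstract can be checked numerically. The sketch below is illustrative only and is not the RoPO parameterization: it assumes one common definition of hyperspherical energy (the sum of inverse pairwise distances between unit-normalized neuron weight vectors), uses a generic random orthogonal matrix in place of whatever rotation the method learns, and treats each row of a hypothetical weight matrix `W` as one neuron. Under those assumptions, rotating all neurons by the same orthogonal matrix and rescaling each neuron's magnitude leaves the energy unchanged, while an unconstrained additive update does not.

```python
import numpy as np

def hyperspherical_energy(W, eps=1e-12):
    """Sum of inverse pairwise distances between unit-normalized rows of W.

    Assumed definition for illustration; each row is treated as one neuron.
    """
    W_hat = W / np.linalg.norm(W, axis=1, keepdims=True)
    diff = W_hat[:, None, :] - W_hat[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(W.shape[0], k=1)  # each unordered pair once
    return float(np.sum(1.0 / (dist[upper] + eps)))

def random_orthogonal(d, rng):
    """Random d x d orthogonal matrix obtained from a QR decomposition."""
    A = rng.standard_normal((d, d))
    Q, R = np.linalg.qr(A)
    # Flipping column signs keeps Q orthogonal and removes the QR sign ambiguity.
    return Q * np.sign(np.diag(R))

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))            # hypothetical layer: 8 neurons, 16 inputs
R = random_orthogonal(16, rng)
scale = rng.uniform(0.5, 2.0, size=(8, 1))  # positive per-neuron magnitude stretch

W_rot = scale * (W @ R)                         # rotation plus magnitude stretching
W_add = W + 0.5 * rng.standard_normal(W.shape)  # unconstrained additive update

e0 = hyperspherical_energy(W)
e_rot = hyperspherical_energy(W_rot)
e_add = hyperspherical_energy(W_add)

print(f"original energy:         {e0:.6f}")
print(f"after rotation + scale:  {e_rot:.6f}")
print(f"after additive update:   {e_add:.6f}")
```

The first two printed values should agree up to floating-point error, because an orthogonal map preserves the angles between neurons and positive scaling does not change their directions. The third value generally differs, which is the kind of energy fluctuation the abstract associates with overfitting.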


Taxonomy

Topics: Color perception and design · Advanced Multi-Objective Optimization Algorithms · Optimization and Packing Problems

Methods: Direct Preference Optimization