Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness

Jian Li; Haojing Huang; Yujia Zhang; Pengfei Xu; Xi Chen; Rui Song; Lida Shi; Jingwen Wang; Hao Xu

arXiv:2409.17791 · cs.CL · September 27, 2024

TL;DR

This paper introduces a Self-supervised Preference Optimization framework that enhances Large Language Models' ability to understand preference degrees, leading to improved performance over existing methods in preference-based training.

Contribution

The paper proposes a novel self-supervised preference degree loss that, when combined with alignment loss, improves LLMs' understanding of human preference intensities.

Findings

01. SPO significantly boosts preference optimization performance.
02. SPO achieves state-of-the-art results on multiple datasets.
03. The framework is compatible with existing preference methods.

Abstract

Recently, there has been significant interest in replacing the reward model in Reinforcement Learning with Human Feedback (RLHF) methods for Large Language Models (LLMs), such as Direct Preference Optimization (DPO) and its variants. These approaches commonly use a binary cross-entropy mechanism on pairwise samples, i.e., minimizing and maximizing the loss based on preferred or dis-preferred responses, respectively. However, while this training strategy omits the reward model, it also overlooks the varying preference degrees within different responses. We hypothesize that this is a key factor hindering LLMs from sufficiently understanding human preferences. To address this problem, we propose a novel Self-supervised Preference Optimization (SPO) framework, which constructs a self-supervised preference degree loss combined with the alignment loss, thereby helping LLMs improve their…
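To make the "binary cross-entropy mechanism on pairwise samples" concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not taken from the paper or its repository; the function name, argument names, and the default `beta` are illustrative assumptions. The point it illustrates is that the training signal only says which response was preferred, not by how much.

```python
import math


def dpo_pair_loss(policy_chosen_logp, policy_rejected_logp,
                  ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All names here are hypothetical; inputs are summed log-probabilities
    of each response under the policy and a frozen reference model.
    """
    # Implicit rewards: scaled log-ratio of policy to reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # Binary cross-entropy with target "chosen beats rejected":
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The label is binary: a pair where the chosen response is only slightly
# better and a pair where it is far better are given the same target.
loss = dpo_pair_loss(-12.0, -15.0, -13.0, -14.0)
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, but nothing in the target encodes the strength of the preference, which is the gap the abstract describes.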

Code & Models

Repositories

lijian16/spo (Official)

Taxonomy

Topics: Speech and dialogue systems · Natural Language Processing Techniques · Advanced Text Analysis Techniques