Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models

Yuchen Fan, Yuzhong Hong, Qiushi Wang, Junwei Bao, Hongfei Jiang, and Yang Song

arXiv:2412.12865 · cs.CL · December 18, 2024

TL;DR

This paper introduces PoFT, a preference-oriented supervised fine-tuning method that encourages models to outperform aligned LLMs on the same data, improving instruction-following capabilities without relying solely on high-quality datasets.

Contribution

PoFT is a novel fine-tuning approach that incorporates preference for the target model over aligned LLMs, enhancing performance and data efficiency in instruction tuning.

Findings

01

PoFT outperforms standard SFT baselines across multiple datasets and models.

02

PoFT can be combined with data filtering and preference optimization techniques like DPO.

03

Extensive experiments validate the effectiveness and stability of PoFT.

Abstract

Alignment, endowing a pre-trained large language model (LLM) with the ability to follow instructions, is crucial for its real-world applications. Conventional supervised fine-tuning (SFT) methods formalize it as causal language modeling, typically with a cross-entropy objective, requiring a large amount of high-quality instruction-response pairs. However, the quality of widely used SFT datasets cannot be guaranteed due to the high cost and intensive labor of creating and maintaining them in practice. To overcome the limitations associated with the quality of SFT datasets, we introduce a novel preference-oriented supervised fine-tuning approach, namely PoFT. The intuition is to boost SFT by imposing a particular preference: favoring the target model over aligned LLMs on the same SFT data. This preference encourages the target model to predict…
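The contrast the abstract draws, a cross-entropy SFT objective versus a preference for the target model over aligned LLMs on the same data, can be illustrated with a small sketch. The abstract as shown here is truncated and does not state the PoFT objective, so the preference term below is an assumption: a Bradley-Terry style log-sigmoid of the gap between the target model's log-likelihood and the average log-likelihood under the aligned LLMs. It is not taken from the paper. The function names and the `beta` scaling factor are hypothetical, and the per-token log-probabilities are made-up inputs.

```python
import math
from typing import Sequence


def sequence_log_likelihood(token_log_probs: Sequence[float]) -> float:
    """Sum of per-token log-probabilities of a response given its instruction."""
    return sum(token_log_probs)


def sft_cross_entropy(token_log_probs: Sequence[float]) -> float:
    """Conventional SFT loss: mean negative log-likelihood over response tokens."""
    return -sum(token_log_probs) / len(token_log_probs)


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def preference_style_loss(
    target_token_log_probs: Sequence[float],
    aligned_token_log_probs: Sequence[Sequence[float]],
    beta: float = 1.0,  # hypothetical scaling factor, not from the source
) -> float:
    """ASSUMED preference-style loss (not the paper's stated formula).

    Scores the same (instruction, response) pair under the target model and
    under one or more aligned LLMs, then penalizes the target model when its
    log-likelihood does not exceed the aligned LLMs' average.
    """
    if not aligned_token_log_probs:
        raise ValueError("at least one aligned LLM is required")
    target = sequence_log_likelihood(target_token_log_probs)
    aligned = sum(
        sequence_log_likelihood(lp) for lp in aligned_token_log_probs
    ) / len(aligned_token_log_probs)
    margin = beta * (target - aligned)
    # -log(sigmoid(margin)) equals softplus(-margin)
    return softplus(-margin)


if __name__ == "__main__":
    target = [-0.5, -0.5]                     # made-up target-model log-probs
    aligned = [[-1.0, -1.0], [-2.0, -2.0]]    # made-up aligned-LLM log-probs
    print("SFT cross-entropy:", sft_cross_entropy(target))
    print("Preference-style loss:", preference_style_loss(target, aligned))
```

Under this assumed form, an example that the aligned LLMs themselves score poorly contributes a smaller loss for the same target likelihood, which is one way a preference term could reduce sensitivity to low-quality SFT pairs.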

Code & Models

Repositories

Savannah120/alignment-handbook-PoFT (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Balanced Selection · Direct Preference Optimization · Shrink and Fine-Tune