Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
Ermo Hua, Biqing Qi, Kaiyan Zhang, Kai Tian, Xingtai Lv, Ning Ding, Bowen Zhou

TL;DR
This paper introduces Intuitive Fine-Tuning (IFT), a unified approach that combines supervised fine-tuning and preference optimization into a single process, enhancing language model alignment with human preferences.
Contribution
The paper presents a novel interpretation of SFT and PO as sub-processes within an MDP, and proposes IFT, which integrates both into one process using a temporal residual connection.
Findings
IFT performs comparably or better than SFT and PO on several tasks.
IFT effectively captures the model's intuitive sense of entire answers.
Experimental results validate IFT's effectiveness in various generation tasks.
Abstract
Supervised Fine-Tuning (SFT) and Preference Optimization (PO) are key processes for aligning Language Models (LMs) with human preferences after pre-training. While SFT excels in efficiency and PO in effectiveness, they are often combined sequentially without integrating their optimization objectives. This approach ignores the opportunity to bridge their paradigm gap and draw on the strengths of both. In this paper, we interpret SFT and PO with two sub-processes -- Preference Estimation and Transition Optimization -- defined at the token level within the Markov Decision Process (MDP). This modeling shows that SFT is only a special case of PO with inferior estimation and optimization. PO estimates the model's preference from its entire generation, while SFT only scores the model's subsequent predicted tokens based on prior tokens from the ground-truth answer. These priors deviate from the model's…
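The contrast the abstract draws between SFT and PO can be illustrated with a toy example. The sketch below is not the paper's method or code: the bigram table `POLICY` and the helper names `sft_token_logprobs` and `greedy_rollout` are hypothetical, chosen only to show that SFT scores each token given a ground-truth prefix (teacher forcing), whereas an on-policy rollout scores tokens given the model's own earlier choices.

```python
import math

# Hypothetical toy bigram "policy": P(next token | previous token).
# This stands in for a language model; it is an assumption for illustration.
POLICY = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"a": 0.1, "b": 0.7, "</s>": 0.2},
    "b": {"a": 0.3, "b": 0.2, "</s>": 0.5},
}


def sft_token_logprobs(target):
    """Teacher forcing: score each target token given the ground-truth prefix."""
    prev, logprobs = "<s>", []
    for tok in target:
        logprobs.append(math.log(POLICY[prev][tok]))
        prev = tok  # the next state comes from the ground truth, not the model
    return logprobs


def greedy_rollout(max_len=10):
    """On-policy generation: each state comes from the model's own last token."""
    prev, seq, logp = "<s>", [], 0.0
    for _ in range(max_len):
        tok = max(POLICY[prev], key=POLICY[prev].get)  # most likely next token
        logp += math.log(POLICY[prev][tok])
        seq.append(tok)
        if tok == "</s>":
            break
        prev = tok  # the next state comes from the model's own output
    return seq, logp


target = ["b", "a", "</s>"]  # hypothetical ground-truth answer
sft_nll = -sum(sft_token_logprobs(target))
rollout, rollout_logp = greedy_rollout()

print("SFT negative log-likelihood on target:", round(sft_nll, 4))
print("Model's own greedy answer:", rollout, "log-prob:", round(rollout_logp, 4))
```

In this toy setting the states visited under teacher forcing (`<s>`, `b`, `a`) differ from those the model visits on its own (`<s>`, `a`, `b`), which is one simple way to picture the gap between ground-truth priors and the model's own generation that the abstract refers to.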
Taxonomy
Topics: Advancements in Photolithography Techniques
Methods: Parrot optimizer: Algorithm and applications to medical problems · Shrink and Fine-Tune
