Decomposing the Delta: What Do Models Actually Learn from Preference Pairs?
Chia-Hsuan Lee, Mingyang Zhou, Renkun Ni, Zelei Cheng, Sihui Dai, Supriyo Chakraborty, Shixiong Zhang, Sambit Sahu, and William Campbell

TL;DR
This paper investigates how two aspects of preference data, generator-level delta and sample-level delta, influence the reasoning capabilities of language models, and offers strategies to make preference training more effective.
Contribution
It introduces a detailed analysis of preference data quality factors and provides practical recommendations for optimizing preference-based training for reasoning tasks.
Findings
Increasing generator-level delta improves out-of-domain reasoning performance.
Filtering data by sample-level delta enhances data efficiency.
Maximizing generator-level delta and exploiting sample-level delta together improve reasoning models.
Abstract
Preference optimization methods such as DPO and KTO are widely used for aligning language models, yet little is understood about what properties of preference data drive downstream reasoning gains. We ask: what aspects of a preference pair improve a reasoning model's performance on general reasoning tasks? We investigate two distinct notions of quality delta in preference data: generator-level delta, arising from the differences in capability between the models that generate chosen and rejected reasoning traces, and sample-level delta, arising from differences in judged quality within an individual preference pair. To study generator-level delta, we vary the generator's scale and model family, and to study sample-level delta, we employ an LLM-as-a-judge to rate the quality of generated traces along multiple reasoning-quality dimensions. We find that increasing generator-level…
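As a rough illustration of filtering by sample-level delta, the sketch below scores each preference pair by the gap between the judge ratings of its chosen and rejected traces and keeps only pairs whose gap meets a threshold. The dimension names, the rating scale, the use of a plain mean to aggregate dimensions, and the threshold value are all assumptions made for this example; the abstract does not specify how the authors aggregate or threshold judge scores.

```python
from statistics import mean


def sample_level_delta(chosen_scores, rejected_scores):
    """Mean judge score of the chosen trace minus that of the rejected trace.

    Averaging over dimensions is an assumption for illustration only.
    """
    dims = chosen_scores.keys() & rejected_scores.keys()
    if not dims:
        raise ValueError("no shared judge dimensions to compare")
    chosen_mean = mean(chosen_scores[d] for d in dims)
    rejected_mean = mean(rejected_scores[d] for d in dims)
    return chosen_mean - rejected_mean


def filter_by_delta(pairs, min_delta):
    """Keep pairs whose sample-level delta is at least min_delta."""
    return [
        p for p in pairs
        if sample_level_delta(p["chosen_scores"], p["rejected_scores"]) >= min_delta
    ]


# Hypothetical judge ratings on two made-up dimensions.
pairs = [
    {
        "id": "a",
        "chosen_scores": {"correctness": 9, "coherence": 8},
        "rejected_scores": {"correctness": 3, "coherence": 4},
    },
    {
        "id": "b",
        "chosen_scores": {"correctness": 7, "coherence": 7},
        "rejected_scores": {"correctness": 6, "coherence": 7},
    },
]

kept = filter_by_delta(pairs, min_delta=2.0)  # hypothetical threshold
```

Here pair "a" has a large quality gap and is kept, while pair "b" has nearly indistinguishable traces and is dropped. In practice the threshold, or a top-k cutoff, would need to be tuned on held-out data.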
