What Does Preference Learning Recover from Pairwise Comparison Data?

Rattana Pukdee; Maria-Florina Balcan; Pradeep Ravikumar

arXiv:2602.10286·cs.LG·February 12, 2026

Open Access

TL;DR

This paper examines what the Bradley--Terry model recovers from pairwise comparison data, particularly when the data violate the model's assumptions. It formalizes the preference information the data encode and analyzes the factors that govern sample efficiency.

Contribution

It formalizes the preference information in triplet data through the conditional preference distribution (CPRD) and gives conditions under which Bradley--Terry (BT) modeling is appropriate, clarifying what BT learning actually recovers.

Findings

01. Conditions for BT model appropriateness based on CPRD

02. Factors like margin and connectivity influence sample efficiency

03. Provides a data-centric understanding of preference learning

Abstract

Pairwise preference learning is central to machine learning, with recent applications in aligning language models with human preferences. A typical dataset consists of triplets $(x, y^{+}, y^{-})$, where response $y^{+}$ is preferred over response $y^{-}$ for context $x$. The Bradley--Terry (BT) model is the predominant approach, modeling preference probabilities as a function of latent score differences. Standard practice assumes data follows this model and learns the latent scores accordingly. However, real data may violate this assumption, and it remains unclear what BT learning recovers in such cases. Starting from triplet comparison data, we formalize the preference information it encodes through the conditional preference distribution (CPRD). We give precise conditions for when BT is appropriate for modeling the CPRD, and identify factors governing sample efficiency -- namely, margin and…
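
To make the abstract's description of the BT model concrete, the following is a minimal sketch of the standard logistic form, in which the probability that $y^{+}$ is preferred over $y^{-}$ for context $x$ is the sigmoid of a latent score difference, $\sigma(s(x, y^{+}) - s(x, y^{-}))$. The symbol $s$, the function names, and the toy triplets and scores are all assumptions made for illustration; none of them come from the paper, and this is not the authors' implementation.

```python
import math


def bt_preference_prob(score_pos, score_neg):
    """Standard BT probability that the first item is preferred:
    the sigmoid of the latent score difference."""
    diff = score_pos - score_neg
    # Branch on the sign of diff so math.exp never receives a large positive argument.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    exp_diff = math.exp(diff)
    return exp_diff / (1.0 + exp_diff)


def bt_negative_log_likelihood(triplets, score):
    """Average negative log-likelihood of (x, y_pos, y_neg) triplets
    under a hypothetical score function score(x, y)."""
    total = 0.0
    for x, y_pos, y_neg in triplets:
        diff = score(x, y_pos) - score(x, y_neg)
        # -log(sigmoid(diff)) written in a numerically stable form.
        if diff >= 0:
            total += math.log1p(math.exp(-diff))
        else:
            total += -diff + math.log1p(math.exp(diff))
    return total / len(triplets)


# Hypothetical toy data: one context "q" and three candidate responses.
toy_triplets = [("q", "a", "b"), ("q", "a", "b"), ("q", "b", "a"), ("q", "b", "c")]
toy_scores = {("q", "a"): 1.0, ("q", "b"): 0.0, ("q", "c"): -1.0}


def toy_score(x, y):
    """Look up the assumed latent score for a (context, response) pair."""
    return toy_scores[(x, y)]


prob_equal = bt_preference_prob(0.0, 0.0)
nll = bt_negative_log_likelihood(toy_triplets, toy_score)
print(prob_equal, nll)
```

Note that the toy data contains both ("q", "a", "b") and ("q", "b", "a"): the same pair compared in both directions. A score-based model can only fit such data by assigning an intermediate probability, which is one simple way real comparison data can depart from a deterministic ranking.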

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Constraint Satisfaction and Optimization · Speech and dialogue systems