MixDPO: Modeling Preference Strength for Pluralistic Alignment

Saki Imai; Pedram Heydari; Anthony Sicilia; Asteria Kaeberlein; Katherine Atwell; Malihe Alikhani

arXiv:2601.06180 · cs.LG · January 13, 2026

TL;DR

MixDPO models variation in preference strength during language model alignment, improving aggregate alignment performance and capturing heterogeneity in human judgments across preference datasets.

Contribution

MixDPO generalizes Direct Preference Optimization (DPO) by explicitly modeling variation in preference strength, improving alignment with heterogeneous human preferences.

Findings

01 Improves aggregate alignment performance (+11.2 points on Pythia-2.8B)

02 Preserves subgroup preferences across datasets

03 Effectively captures preference heterogeneity in training data

Abstract

Preference-based alignment objectives implicitly assume that all human preferences are expressed with equal strength. In practice, however, preference strength varies across individuals and contexts, a phenomenon established in behavioral economics and discrete choice theory. This mismatch limits the ability of existing objectives to faithfully capture heterogeneous human judgments. Inspired by this literature, we introduce Mixed Logit Direct Preference Optimization (MixDPO), a generalization of Direct Preference Optimization that models variation in preference strength. MixDPO enables alignment objectives to capture heterogeneity in how strongly preferences are expressed across training examples. We evaluate MixDPO on three preference datasets using two open-weight language models. Across datasets, MixDPO improves aggregate alignment performance (+11.2 points on Pythia-2.8B) while…
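
To make the contrast in the abstract concrete, the sketch below compares the standard DPO loss, which uses one fixed strength parameter, with a mixed-logit style loss that averages the choice probability over a distribution of strengths. The abstract does not give the paper's actual parameterization, so everything specific here is an assumption for illustration: the log-normal distribution over the strength `beta`, the Monte Carlo estimator, and the names `dpo_loss`, `mixed_logit_dpo_loss`, `mu`, `sigma`, and `num_samples`.

```python
import math
import random


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def dpo_loss(margin: float, beta: float) -> float:
    """Standard DPO loss for one preference pair with a fixed strength beta.

    margin is the implicit reward gap between chosen (w) and rejected (l):
    (log pi(y_w|x) - log pi_ref(y_w|x)) - (log pi(y_l|x) - log pi_ref(y_l|x)).
    """
    return -log_sigmoid(beta * margin)


def mixed_logit_dpo_loss(
    margin: float,
    mu: float,
    sigma: float,
    num_samples: int = 256,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of -log E_beta[sigmoid(beta * margin)].

    ASSUMPTION (not from the source): beta ~ LogNormal(mu, sigma), which keeps
    the preference strength positive. The paper may use a different
    distribution or estimator.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        beta = rng.lognormvariate(mu, sigma)
        total += math.exp(log_sigmoid(beta * margin))
    return -math.log(total / num_samples)
```

With `sigma = 0` every sampled strength equals `exp(mu)`, so the mixed loss reduces to `dpo_loss(margin, exp(mu))`. This is the sense in which a mixed-logit objective generalizes the fixed-strength one.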

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Constraint Satisfaction and Optimization · Recommender Systems and Techniques