PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences
Daiwei Chen, Yi Chen, Aniket Rege, Ramya Korlakai Vinayak

TL;DR
PAL introduces a novel framework for learning from diverse human preferences, enabling foundation models to better adapt to a plurality of opinions and to improve reward modeling efficiency across multiple domains.
Contribution
The paper proposes a preference modeling framework that uses the ideal point model and mixture modeling to capture the plurality of preferences and to generalize to unseen users, improving reward model training.
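As a rough illustration of how an ideal point model and mixture modeling can fit together, the sketch below represents each user's ideal point as a convex combination of a few shared prototype points and scores a comparison by which item lies closer to that point. The function names, the squared Euclidean distance, and the logistic link are assumptions made for illustration; the summary above does not specify the paper's exact parameterization.

```python
import numpy as np

def user_ideal_point(prototypes, user_weights):
    """Mix K prototype ideal points into one user-specific ideal point.

    Hypothetical parameterization: the weights must form a convex combination.
    """
    w = np.asarray(user_weights, dtype=float)
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("user_weights must be non-negative and sum to 1")
    return w @ np.asarray(prototypes, dtype=float)

def prefers_first_prob(item_a, item_b, ideal_point):
    """Probability that the user prefers item_a over item_b.

    Assumed link: the item closer to the ideal point (in squared
    Euclidean distance) is more likely to be preferred.
    """
    a = np.asarray(item_a, dtype=float)
    b = np.asarray(item_b, dtype=float)
    ideal = np.asarray(ideal_point, dtype=float)
    dist_a = np.sum((a - ideal) ** 2)
    dist_b = np.sum((b - ideal) ** 2)
    return float(1.0 / (1.0 + np.exp(-(dist_b - dist_a))))

# Two prototype ideal points in a 2-D latent space (made-up values).
prototypes = [[0.0, 0.0], [4.0, 0.0]]
item_a, item_b = [2.0, 0.0], [0.0, 0.0]

# A user who mixes both prototypes equally sits at [2, 0] and favors item_a.
p_mixed = prefers_first_prob(item_a, item_b, user_ideal_point(prototypes, [0.5, 0.5]))
# A user aligned with the first prototype sits at [0, 0] and favors item_b.
p_first = prefers_first_prob(item_a, item_b, user_ideal_point(prototypes, [1.0, 0.0]))
```

In this toy setup the same pair of items gets opposite preference probabilities for the two users, which is the kind of plurality a single shared reward cannot express. An unseen user would only need a new weight vector, not new prototypes.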
Findings
PAL achieves competitive reward model accuracy on language, image, and heterogeneous datasets.
The approach effectively captures diverse preferences and improves few-shot generalization.
Current preference datasets may oversimplify preferences, highlighting the need for nuanced data collection.
Abstract
Large foundation models pretrained on raw web-scale data are not readily deployable without an additional step of extensive alignment to human preferences. Such alignment is typically done by collecting large amounts of pairwise comparisons from humans ("Do you prefer output A or B?") and learning a reward model or a policy with the Bradley-Terry-Luce (BTL) model as a proxy for a human's underlying implicit preferences. These methods generally suffer from assuming a universal preference shared by all humans, which lacks the flexibility to adapt to a plurality of opinions and preferences. In this work, we propose PAL, a framework to model human preference complementary to existing pretraining strategies, which incorporates plurality from the ground up. We propose using the ideal point model as a lens to view alignment using preference comparisons. Together with our novel reformulation and…
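For readers unfamiliar with the BTL baseline mentioned in the abstract, the following minimal sketch shows the standard BTL preference probability and the pairwise loss commonly used to fit a reward model with it. The function names are hypothetical, and the rewards are plain numbers standing in for the outputs of a learned reward model; note that one reward function is shared by every annotator, which is the universal-preference assumption the abstract criticizes.

```python
import numpy as np

def btl_prob(reward_a, reward_b):
    """BTL probability that output A is preferred over output B."""
    return float(1.0 / (1.0 + np.exp(-(reward_a - reward_b))))

def btl_negative_log_likelihood(rewards_chosen, rewards_rejected):
    """Mean negative log-likelihood of observed pairwise comparisons.

    Each pair contributes -log sigmoid(r_chosen - r_rejected), computed
    as log(1 + exp(-diff)) via logaddexp for numerical stability.
    """
    diff = np.asarray(rewards_chosen, dtype=float) - np.asarray(rewards_rejected, dtype=float)
    return float(np.mean(np.logaddexp(0.0, -diff)))

# Equal rewards give a coin flip; a higher reward for A tilts the odds toward A.
p_equal = btl_prob(1.0, 1.0)
p_a_better = btl_prob(2.0, 0.0)

# The loss is lower when chosen outputs receive the higher rewards.
loss_good = btl_negative_log_likelihood([2.0, 2.0], [0.0, 0.0])
loss_bad = btl_negative_log_likelihood([0.0, 0.0], [2.0, 2.0])
```

Because the probability depends only on the reward difference, two annotators who disagree about the same pair can only be treated as noise under this model.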
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Bayesian Modeling and Causal Inference · Logic, Reasoning, and Knowledge
