Multi-Objective Preference Optimization: Improving Human Alignment of Generative Models
Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Zheng Wen

TL;DR
This paper introduces MOPO, a multi-objective preference optimization algorithm for aligning language models with multiple human objectives, improving upon single-objective methods by directly handling preference data and balancing conflicting goals.
Contribution
The paper presents MOPO, a novel multi-objective preference optimization method that operates on pairwise preference data, enabling better alignment with multiple human objectives without heuristic engineering.
Findings
MOPO approximates the Pareto front on synthetic benchmarks.
MOPO outperforms baselines in real-world human-preference fine-tuning.
Ablation studies show optimization stability and hyperparameter robustness.
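For readers unfamiliar with the term in the findings above, a Pareto front is the set of solutions that no other solution beats on every objective at once. The sketch below is only an illustration of that definition and is not taken from the paper: the function names `dominates` and `pareto_front`, the two objectives, and the scores are hypothetical, and higher is assumed to be better on every objective.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (maximization assumed)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (helpfulness, harmlessness) scores for five candidate policies.
scores = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8), (0.5, 0.5), (0.3, 0.3)]
front = pareto_front(scores)
print(front)  # [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8)]
```

The points (0.5, 0.5) and (0.3, 0.3) are dropped because (0.7, 0.6) is better on both objectives. The three remaining points each trade one objective against the other, which is the kind of trade-off curve a multi-objective method aims to recover.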
Abstract
Post-training of LLMs with RLHF, and subsequently preference optimization algorithms such as DPO, IPO, etc., made a big difference in improving human alignment. However, all such techniques can only work with a single (human) objective. In practice, human users have multiple objectives, such as helpfulness and harmlessness, and there is no natural way to aggregate them into a single objective. In this paper, we address the multi-objective preference-alignment problem, where a policy must optimize several, potentially conflicting, objectives. We introduce the Multi-Objective Preference Optimization (MOPO) algorithm, which frames alignment as a constrained KL-regularized optimization: the primary objective is maximized while secondary objectives are lower-bounded by tunable safety thresholds. Unlike prior work, MOPO operates directly on pairwise preference data, requires no point-wise…
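The constrained formulation described in the abstract can be written generically as follows. This is a hedged sketch, not the paper's own notation: the symbols $J_k$ (the value of objective $k$ under policy $\pi$), $b_k$ (the threshold for secondary objective $k$), $\beta$ (the regularization weight), and $\pi_{\mathrm{ref}}$ (the reference policy) are assumptions introduced here, as is the choice to place the KL term in the objective and not in a constraint.

```latex
\begin{aligned}
\max_{\pi} \quad & J_1(\pi) \;-\; \beta \, \mathrm{KL}\!\left(\pi \,\|\, \pi_{\mathrm{ref}}\right) \\
\text{s.t.} \quad & J_k(\pi) \;\ge\; b_k, \qquad k = 2, \dots, K .
\end{aligned}
```

Here objective 1 plays the role of the primary objective, and raising or lowering each $b_k$ tightens or relaxes the corresponding secondary requirement.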
Taxonomy
Topics: Constraint Satisfaction and Optimization · Advanced Multi-Objective Optimization Algorithms · Recommender Systems and Techniques
Methods: Direct Preference Optimization
