AMoPO: Adaptive Multi-objective Preference Optimization without Reward Models and Reference Models
Qi Liu, Jingqing Ruan, Hao Li, Haodong Zhao, Desheng Wang, Jiansong Chen, Wan Guanglu, Xunliang Cai, Zhi Zheng, Tong Xu

TL;DR
AMoPO introduces a novel multi-objective optimization framework for aligning large language models with diverse preferences without relying on reward or reference models, using adaptive weighting based on a Gaussian distribution.
Contribution
The paper presents AMoPO, a new adaptive multi-objective preference optimization method that dynamically balances preference dimensions without auxiliary models, improving alignment efficiency and scalability.
Findings
Outperforms state-of-the-art baselines by 28.5%
Effective across models of 7B, 14B, and 32B parameters
Demonstrates strong adaptability and preference dimension management
Abstract
Existing multi-objective preference alignment methods for large language models (LLMs) face two limitations: (1) an inability to effectively balance various preference dimensions, and (2) a reliance on auxiliary reward/reference models that introduces computational complexity. To address these challenges, we propose Adaptive Multi-objective Preference Optimization (AMoPO), a novel framework that achieves dynamic balance across preference dimensions. By using dimension-aware generation metrics as implicit rewards within a multi-objective optimization paradigm, AMoPO aligns LLMs with diverse preferences without additional reward models or reference models. We introduce an adaptive weight assignment mechanism that models the generation space as a Gaussian distribution, allowing dynamic prioritization of preference dimensions. Empirical results demonstrate that AMoPO outperforms…
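The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes that each preference dimension yields a scalar implicit reward for a response (for example, a length-normalized log-probability), that per-dimension weights are drawn from a Gaussian and normalized with a softmax, and that the weighted reward margin between a chosen and a rejected response feeds a reference-free logistic loss. The names `sample_dimension_weights`, `reference_free_loss`, and `beta` are hypothetical.

```python
import math
import random


def sample_dimension_weights(means, stds, rng):
    """Draw one raw weight per preference dimension from a Gaussian,
    then softmax-normalize so the weights are positive and sum to 1.
    (Assumed scheme for illustration; not taken from the paper.)"""
    raw = [rng.gauss(m, s) for m, s in zip(means, stds)]
    peak = max(raw)  # subtract the max for numerical stability
    exps = [math.exp(r - peak) for r in raw]
    total = sum(exps)
    return [e / total for e in exps]


def reference_free_loss(chosen_scores, rejected_scores, weights, beta=2.0):
    """Logistic pairwise loss on the weighted margin of per-dimension
    implicit rewards. No reward model or reference model is involved:
    the scores are assumed to come from the policy itself."""
    margin = sum(
        w * (c - r)
        for w, c, r in zip(weights, chosen_scores, rejected_scores)
    )
    z = beta * margin
    # -log(sigmoid(z)) computed as a numerically stable softplus(-z)
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical usage with three preference dimensions.
rng = random.Random(0)
weights = sample_dimension_weights([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], rng)
chosen = [-0.8, -1.1, -0.9]    # made-up per-dimension scores
rejected = [-1.5, -1.0, -1.7]
loss = reference_free_loss(chosen, rejected, weights)
```

Resampling the weights at each training step is one way such a scheme could shift emphasis among dimensions over time; how AMoPO actually parameterizes and updates its Gaussian is not stated in the text above.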
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Multi-Objective Optimization Algorithms · Machine Learning and Data Classification
