Pareto Multi-Objective Alignment for Language Models
Qiang He, Setareh Maghsudi

TL;DR
This paper introduces Pareto Multi-Objective Alignment (PAMA), a scalable and efficient algorithm for multi-objective alignment of large language models, enabling them to balance conflicting objectives effectively.
Contribution
PAMA recasts multi-objective RLHF as a convex optimization problem with a closed-form solution, which sharply reduces computational cost and makes multi-objective alignment practical for large language models.
Findings
PAMA achieves convergence to Pareto stationary points.
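It reduces the complexity of the multi-objective optimization step from O(n^2*d) to O(n), where n is the number of objectives and d is the number of model parameters, so the step completes in milliseconds.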
Experiments demonstrate robust multi-objective alignment across models from 125M to 7B parameters.
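To make "Pareto stationary" and "closed-form solution" concrete, here is a minimal sketch of the classic two-objective min-norm weighting used in gradient-based multi-objective optimization. It is not PAMA's own update rule, which this summary does not spell out; the function names and the toy gradients are assumptions for illustration only. A point is Pareto stationary when some convex combination of the objective gradients is zero, and for two objectives the minimizing weight has a closed form.

```python
# Illustrative only: the standard two-objective min-norm closed form,
# NOT the PAMA algorithm itself. All names here are hypothetical.

def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def min_norm_weight(g1, g2):
    """Return alpha in [0, 1] minimizing ||alpha*g1 + (1 - alpha)*g2||^2."""
    diff = [a - b for a, b in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:
        return 0.5  # identical gradients: every weight gives the same vector
    # Unconstrained minimizer ((g2 - g1) . g2) / ||g1 - g2||^2, clipped to [0, 1].
    alpha = dot([b - a for a, b in zip(g1, g2)], g2) / denom
    return min(1.0, max(0.0, alpha))

def common_direction(g1, g2):
    """Min-norm convex combination of the two objective gradients."""
    alpha = min_norm_weight(g1, g2)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]

def is_pareto_stationary(g1, g2, tol=1e-8):
    """True when the min-norm combination is (numerically) zero."""
    d = common_direction(g1, g2)
    return dot(d, d) ** 0.5 < tol

# Toy gradients (made up): orthogonal objectives can still both improve,
# while directly opposed objectives leave no common direction.
print(common_direction([1.0, 0.0], [0.0, 1.0]))
print(is_pareto_stationary([1.0, 0.0], [-1.0, 0.0]))
```

The weight computation itself is a handful of scalar operations once the inner products are available; the expensive part in general methods is forming those inner products over all d parameters, which is the cost the summary says PAMA avoids.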
Abstract
Large language models (LLMs) are increasingly deployed in real-world applications that require careful balancing of multiple, often conflicting, objectives, such as informativeness versus conciseness, or helpfulness versus creativity. However, current alignment methods, primarily based on RLHF, optimize LLMs toward a single reward function, resulting in rigid behavior that fails to capture the complexity and diversity of human preferences. This limitation hinders the adaptability of LLMs to practical scenarios, making multi-objective alignment (MOA) a critical yet underexplored area. To bridge this gap, we propose Pareto Multi-Objective Alignment (PAMA), a principled and computationally efficient algorithm designed explicitly for MOA in LLMs. In contrast to computationally prohibitive multi-objective optimization (MOO) methods, PAMA transforms multi-objective RLHF into a convex…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
