Multi-Objective Alignment of Large Language Models Through Hypervolume Maximization

Subhojyoti Mukherjee; Anusha Lalitha; Sailik Sengupta; Aniket Deshmukh; Branislav Kveton

arXiv:2412.05469·cs.LG·December 10, 2024

Open Access

TL;DR

This paper introduces HaM, an efficient algorithm for multi-objective alignment of large language models that maximizes hypervolume to produce diverse, high-quality solutions covering complex human preferences.

Contribution

It presents the first application of a-posteriori multi-objective optimization to human feedback in LLMs, enabling diverse solutions without prior preference knowledge.

Findings

01 HaM outperforms existing methods on multiple objectives

02 It efficiently covers the Pareto front of preferences

03 Empirical results show improvements in harmlessness, helpfulness, and faithfulness

Abstract

Multi-objective alignment from human feedback (MOAHF) in large language models (LLMs) is a challenging problem as human preferences are complex, multifaceted, and often conflicting. Recent works on MOAHF considered a-priori multi-objective optimization (MOO), where human preferences are known at training or inference time. In contrast, when human preferences are unknown or difficult to quantify, a natural approach is to cover the Pareto front by multiple diverse solutions. We propose an algorithm HaM for learning diverse LLM policies that maximizes their hypervolume. This is the first application of a-posteriori MOO to MOAHF. HaM is computationally and space efficient, and empirically superior across objectives such as harmlessness, helpfulness, humor, faithfulness, and hallucination, on various datasets.
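
To make the hypervolume objective concrete, the following is a minimal sketch of the standard two-objective hypervolume indicator: the area dominated by a set of solutions, measured from a reference point. It is not the HaM algorithm itself. The function names, the reference point, and the idea of treating each point as a (harmlessness, helpfulness) score pair for one policy are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the points not dominated by any other point (both objectives maximized)."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    # Remove exact duplicates and return in a stable order.
    return sorted(set(front))


def hypervolume_2d(points: Sequence[Point], ref: Point = (0.0, 0.0)) -> float:
    """Area dominated by `points` and bounded below by the reference point `ref`."""
    # Only points strictly better than the reference point contribute area.
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    # Sort by the first objective, descending; on a Pareto front the
    # second objective is then ascending, so each point adds one strip.
    front.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    prev_y = ref[1]
    for x, y in front:
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


# Hypothetical (harmlessness, helpfulness) scores for a few policies.
diverse = [(1.0, 3.0), (3.0, 1.0)]
clustered = [(2.0, 2.0), (2.0, 2.0)]
print(hypervolume_2d(diverse))    # 5.0
print(hypervolume_2d(clustered))  # 4.0
```

In this toy example, two policies that specialize in different objectives dominate more area than two identical balanced policies, which illustrates why maximizing hypervolume rewards a diverse set of solutions along the Pareto front.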

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification