Multi-objective Large Language Model Alignment with Hierarchical Experts

Zhuo Li; Guodong Du; Weiyang Guo; Yigeng Zhou; Xiucheng Li; Wenya Wang; Fangming Liu; Yequan Wang; Deheng Ye; Min Zhang; Jing Li

arXiv:2505.20925·cs.CL·May 28, 2025

Open Access 3 Reviews

TL;DR

This paper introduces HoE, a lightweight, plug-and-play hierarchical mixture-of-experts method that enables large language models to efficiently balance multiple conflicting objectives without retraining, improving alignment across diverse preferences.

Contribution

HoE is a novel hierarchical mixture-of-experts approach that allows LLMs to adapt across the entire Pareto frontier without retraining, balancing multiple objectives efficiently.

Findings

1. Outperforms 15 recent baselines on 14 objectives and 200 preferences.

2. Achieves optimal Pareto frontiers with reduced training cost.

3. Demonstrates versatility across various tasks and benchmarks.

Abstract

Aligning large language models (LLMs) to simultaneously satisfy multiple objectives remains a significant challenge, especially given the diverse and often conflicting nature of human preferences. Existing alignment methods struggle to balance trade-offs effectively, often requiring costly retraining or yielding suboptimal results across the Pareto frontier of preferences. In this paper, we introduce HoE (Hierarchical Mixture-of-Experts), a lightweight, parameter-efficient, and plug-and-play approach that eliminates the need for model training, while enabling LLMs to adapt across the entire Pareto frontier and accommodate diverse user preferences. In particular, HoE consists of three hierarchical components: LoRA Experts, Router Experts and Preference Routing, reaching optimal Pareto frontiers and achieving a trade-off between parameter size,…
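To make the idea of adapting to a user preference without retraining more concrete, the sketch below blends per-objective LoRA updates using a preference vector. It is only an illustration under stated assumptions: it uses plain linear interpolation of low-rank updates, which is a generic technique and not necessarily how HoE's Router Experts or Preference Routing work. The names `matmul` and `merge_experts` are hypothetical, and each expert is assumed to be a pair of small matrices (B, A) whose product is the weight update for one objective.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_experts(
    base: Matrix,
    experts: Sequence[Tuple[Matrix, Matrix]],
    preference: Sequence[float],
) -> Matrix:
    """Return base + sum_i preference[i] * (B_i @ A_i).

    Assumption: `preference` lies on the probability simplex, one weight
    per objective-specific expert. This is generic linear interpolation,
    not a claim about the routing used in the paper.
    """
    if len(experts) != len(preference):
        raise ValueError("need exactly one preference weight per expert")
    if any(w < 0 for w in preference) or abs(sum(preference) - 1.0) > 1e-9:
        raise ValueError("preference weights must be non-negative and sum to 1")

    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for w, (b_mat, a_mat) in zip(preference, experts):
        delta = matmul(b_mat, a_mat)  # low-rank update for this objective
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += w * value
    return merged
```

Changing only the preference vector yields a different merged weight matrix from the same frozen experts, which is the sense in which such a scheme needs no further training at inference time.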

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

The paper is well-written and easy to follow. A number of different NLP tasks were taken to evaluate the performance of the proposed framework. A number of datasets were used to conduct the experiments.

Weaknesses

See the questions below.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The core architectural idea of a hierarchical Mixture-of-Experts for MOA, inspired by decomposition methods, is novel.
2. HoE achieves state-of-the-art performance, consistently dominating the Pareto frontiers of 15 competitive baselines (including RS, MOD, and RiC) in 2-objective settings. This is a very strong empirical contribution.
3. The paper features high-quality ablation studies that provide clear insights into the model's components.
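As a reading aid for the second strength, the following sketch shows what it means for one point to dominate another and how a Pareto front is extracted from a set of evaluated models in a 2-objective setting. It is a generic textbook construction, not code from the paper; the function names are hypothetical, and all objectives are assumed to be rewards to maximize.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every objective and
    strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def front_dominates(front_a: Sequence[Point], front_b: Sequence[Point]) -> bool:
    """True if every point of `front_b` is dominated by some point of `front_a`."""
    return all(any(dominates(a, b) for a in front_a) for b in front_b)
```

For example, with made-up scores, `pareto_front([(0.9, 0.2), (0.5, 0.5), (0.4, 0.4)])` drops `(0.4, 0.4)` because `(0.5, 0.5)` is better on both objectives.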

Weaknesses

1. The paper's central claim is the achievement of "optimal Pareto frontiers" and "superior Pareto-optimal results". However, the paper provides no evidence that the proposed HoE method actually converges to the true, global Pareto-optimal frontier. The theoretical analysis in Appendix G relies on strong assumptions, such as the convexity of the objective functions (Assumption G.1), which are well known not to hold in the non-convex landscape of LLM optimization. While the use of Tchebycheff (TC

Reviewer 03 · Rating 4 · Confidence 3

Strengths

The paper proposes a novel alignment approach named HoE. Its strengths are:
* **Methodology**: This paper introduces a hierarchical expert-model framework (HoE) to handle multi-objective alignment, and incorporates Pareto-optimality concepts to provide a theoretical grounding for the approach.
* **Scalable and extensible**: Leverages model fusion techniques and a lightweight routing module to enable efficient training with lower resource costs.
* **Experiments**: The evaluation covers mult

Weaknesses

However, the paper has some weaknesses:
* **Methodology**: Multi-objective LoRA experts are expected to learn different preferences, and the experimental results validate this. However, the design of the multi-objective router experts seems redundant. The ablation study only discusses a single router, and the role and necessity of a multi-expert router are not clearly demonstrated.
* **Reproducibility**: The results appear to rely on the pretrained model, yet the manuscript does not s

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques