Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling

Yuxuan Yao; Han Wu; Mingyang Liu; Sichun Luo; Xiongwei Han; Jie Liu; Zhijiang Guo; Linqi Song

arXiv:2410.03777·cs.CL·February 26, 2025

Open Access

TL;DR

This paper investigates the factors affecting large language model ensembling, emphasizing model compatibility, and introduces UniTE, a top-k union method that improves efficiency and performance in model combination.

Contribution

It identifies key determinants of ensemble effectiveness and proposes UniTE, a novel top-k union approach that simplifies model ensembling by avoiding full vocabulary alignment.
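To make the top-k union idea concrete, here is a minimal sketch of a single decoding step, written under stated assumptions rather than taken from the paper. It assumes each model exposes its next-token distribution as a plain dictionary from token string to probability, that a token missing from a model's vocabulary gets probability 0, and that the models are combined by a simple average after renormalizing over the union. The function names `top_k_tokens` and `union_top_k_step` are hypothetical, and the paper's actual handling of missing tokens and weighting may differ.

```python
from typing import Dict, List


def top_k_tokens(probs: Dict[str, float], k: int) -> List[str]:
    """Return the k most probable tokens of one model."""
    return sorted(probs, key=probs.get, reverse=True)[:k]


def union_top_k_step(model_probs: List[Dict[str, float]], k: int) -> Dict[str, float]:
    """Combine next-token distributions over the union of each model's top-k tokens.

    Assumption (not from the paper): a token absent from a model's
    vocabulary is given probability 0 for that model.
    """
    # 1. Build the union of the per-model top-k token sets.
    union = set()
    for probs in model_probs:
        union.update(top_k_tokens(probs, k))

    # 2. Restrict each model to the union, renormalize, and average.
    averaged = {tok: 0.0 for tok in union}
    for probs in model_probs:
        restricted = {tok: probs.get(tok, 0.0) for tok in union}
        total = sum(restricted.values())
        for tok, p in restricted.items():
            normalized = p / total if total > 0 else 0.0
            averaged[tok] += normalized / len(model_probs)
    return averaged


# Toy distributions for two hypothetical models with different vocabularies.
model_a = {"Paris": 0.6, "London": 0.3, "Rome": 0.1}
model_b = {"Paris": 0.5, "Berlin": 0.4, "London": 0.1}

combined = union_top_k_step([model_a, model_b], k=2)
best = max(combined, key=combined.get)
print(best, combined)
```

With k=2 the union holds only three tokens ("Paris", "London", "Berlin"), so the combination touches three entries per model instead of every vocabulary entry. That is the intuition behind avoiding full vocabulary alignment, shown here on toy data only.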

Findings

01. UniTE outperforms existing ensembling methods across benchmarks.

02. Model compatibility is crucial for effective LLM ensembling.

03. Top-k union reduces computational costs significantly.

Abstract

Large language models (LLMs) exhibit varying strengths and weaknesses across different tasks, prompting recent studies to explore the benefits of ensembling models to leverage their complementary advantages. However, existing LLM ensembling methods often overlook model compatibility and struggle with inefficient alignment of probabilities across the entire vocabulary. In this study, we empirically investigate the factors influencing ensemble performance, identifying model performance, vocabulary size, and response style as key determinants, revealing that compatibility among models is essential for effective ensembling. This analysis leads to the development of a simple yet effective model selection strategy that identifies compatible models. Additionally, we introduce the Union Top-k Ensembling (UniTE), a novel approach that efficiently combines…

Taxonomy

Topics: Natural Language Processing Techniques