Rethinking LLM Ensembling from the Perspective of Mixture Models

Jiale Fu; Yuchu Jiang; Peijun Wu; Chonghan Liu; Joey Tianyi Zhou; Xu Yang

arXiv:2605.00419·cs.LG·May 4, 2026

TL;DR

This paper introduces the Mixture-model-like Ensemble (ME), which treats an LLM ensemble as a mixture model and stochastically selects a single model to generate each token, giving faster inference while preserving the benefits of ensembling.

Contribution

It reinterprets LLM ensembling as a mixture model, enabling stochastic single-model sampling that reduces computational cost and links ensembling to token routing methods.
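The single-model sampling idea can be illustrated with a small Python sketch. It is not the authors' implementation: the two next-token distributions, the uniform weights, and all function names are made-up assumptions, and fixed toy lists stand in for real LLM forward passes. The sketch compares drawing a token from the explicitly averaged distribution with first drawing a model and then drawing a token from that model alone.

```python
import random


def ensemble_distribution(dists, weights):
    """Weighted average of the per-model next-token distributions."""
    vocab_size = len(dists[0])
    return [sum(w * d[v] for w, d in zip(weights, dists)) for v in range(vocab_size)]


def sample_conventional(dists, weights, rng):
    """Conventional route: use every model's distribution, then sample from the average."""
    probs = ensemble_distribution(dists, weights)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_mixture(dists, weights, rng):
    """Mixture route: pick one model at random, then sample from that model only."""
    k = rng.choices(range(len(dists)), weights=weights, k=1)[0]
    return rng.choices(range(len(dists[k])), weights=dists[k], k=1)[0]


def empirical(sampler, dists, weights, n, seed):
    """Empirical token frequencies over n independent draws."""
    rng = random.Random(seed)
    counts = [0] * len(dists[0])
    for _ in range(n):
        counts[sampler(dists, weights, rng)] += 1
    return [c / n for c in counts]


# Hypothetical toy data: two "models" over a 3-token vocabulary, equally weighted.
DISTS = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
WEIGHTS = [0.5, 0.5]

if __name__ == "__main__":
    print("exact ensemble:", ensemble_distribution(DISTS, WEIGHTS))
    print("conventional  :", empirical(sample_conventional, DISTS, WEIGHTS, 100_000, seed=0))
    print("mixture       :", empirical(sample_mixture, DISTS, WEIGHTS, 100_000, seed=1))
```

In this toy setting both samplers read from precomputed lists, so no compute is saved. In a real system, the mixture route would only need a forward pass from the selected model at that step, whereas the conventional route needs one from every model.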

Findings

01. ME is 1.78x-2.68x faster than conventional ensembling.

02. ME is mathematically equivalent to sampling from the ensemble.

03. The approach connects ensembling with token-level routing methods.
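The equivalence in the second finding can be checked with a one-line marginalization. The notation here is an assumption and is not taken from the paper: $p_k$ is the next-token distribution of model $k$, $w_k$ is its ensemble weight (uniform $w_k = 1/K$ recovers plain averaging), and $z_t$ is the index of the model selected at step $t$.

```latex
\begin{align*}
p_{\mathrm{ens}}(x_t \mid x_{<t}) &= \sum_{k=1}^{K} w_k \, p_k(x_t \mid x_{<t}),
  \qquad w_k \ge 0,\ \sum_{k=1}^{K} w_k = 1, \\
z_t &\sim \mathrm{Categorical}(w_1, \dots, w_K),
  \qquad x_t \mid z_t \sim p_{z_t}(\cdot \mid x_{<t}), \\
\Pr(x_t = v \mid x_{<t}) &= \sum_{k=1}^{K} \Pr(z_t = k)\, \Pr(x_t = v \mid z_t = k,\, x_{<t})
  = \sum_{k=1}^{K} w_k \, p_k(v \mid x_{<t})
  = p_{\mathrm{ens}}(v \mid x_{<t}).
\end{align*}
```

Because the per-step conditional distributions agree for every prefix, the chain rule gives the same distribution over whole sequences. This argument covers sampling only; it says nothing about greedy (argmax) decoding from the averaged distribution.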

Abstract

Model ensembling is a well-established technique for improving the performance of machine learning models. Conventionally, this involves averaging the output distributions of multiple models and selecting the most probable label. This idea has been naturally extended to large language models (LLMs), yielding improved performance but incurring substantial computational cost. This inefficiency stems from directly applying the conventional ensemble implementation to LLMs, which requires a separate forward pass for each model to explicitly compute the ensemble distribution. In this paper, we propose the Mixture-model-like Ensemble (ME). By reinterpreting the ensemble as a mixture model, ME stochastically selects a single model at each step to generate the next token, thereby avoiding the need to explicitly compute the full ensemble distribution. ME is mathematically equivalent to sampling from…

Code & Models

Repositories

jialefu/Mixture-model-like-Ensemble (GitHub)
