Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?

Wenzhe Li; Yong Lin; Mengzhou Xia; Chi Jin

arXiv:2502.00674·cs.CL·February 4, 2025

TL;DR

This paper questions whether mixing different large language models in an ensemble is actually beneficial. It proposes Self-MoA, which aggregates multiple outputs from only the single top-performing model and outperforms standard Mixture-of-Agents (MoA) in a large number of scenarios, including a new state-of-the-art result on AlpacaEval 2.0.

Contribution

The paper introduces Self-MoA, an ensemble method that aggregates outputs from only the single top-performing LLM rather than from several different LLMs, and shows that it outperforms standard Mixture-of-Agents in a large number of scenarios.

Findings

01

Self-MoA outperforms standard MoA by 6.6% on AlpacaEval 2.0.

02

Self-MoA achieves an average improvement of 3.8% over MoA across various benchmarks, including MMLU, CRUX, and MATH.

03

Mixing different LLMs can lower average output quality, making top-only aggregation more effective.

Abstract

Ensembling outputs from diverse sources is a straightforward yet effective approach to boost performance. Mixture-of-Agents (MoA) is one such popular ensemble method that aggregates outputs from multiple different Large Language Models (LLMs). This paper raises the question in the context of language models: is mixing different LLMs truly beneficial? We propose Self-MoA -- an ensemble method that aggregates outputs from only the single top-performing LLM. Our extensive experiments reveal that, surprisingly, Self-MoA outperforms standard MoA that mixes different LLMs in a large number of scenarios: Self-MoA achieves $6.6\%$ improvement over MoA on the AlpacaEval 2.0 benchmark, and an average of $3.8\%$ improvement across various benchmarks, including MMLU, CRUX, and MATH. Applying Self-MoA to one of the top-ranking models in AlpacaEval 2.0 directly achieves the new state-of-the-art…
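
To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a "proposer" is any callable that maps a prompt to one response and an "aggregator" is any callable that combines candidate responses into one. All names here (`mixed_moa`, `self_moa`, `make_stub_model`, `join_aggregator`, `num_samples`) are hypothetical. The stub models stand in for real LLM calls, and the toy aggregator simply joins strings, whereas in MoA-style systems the aggregator is typically itself an LLM prompted to synthesize the candidates.

```python
from itertools import count
from typing import Callable, List

# Hypothetical type aliases: a proposer maps a prompt to one response,
# an aggregator combines several candidate responses into one.
Proposer = Callable[[str], str]
Aggregator = Callable[[str, List[str]], str]


def mixed_moa(prompt: str, proposers: List[Proposer], aggregate: Aggregator) -> str:
    """Standard MoA: one candidate from each of several different models."""
    candidates = [propose(prompt) for propose in proposers]
    return aggregate(prompt, candidates)


def self_moa(prompt: str, top_proposer: Proposer, aggregate: Aggregator,
             num_samples: int = 4) -> str:
    """Self-MoA: several candidates sampled from the single top-performing model."""
    candidates = [top_proposer(prompt) for _ in range(num_samples)]
    return aggregate(prompt, candidates)


def make_stub_model(name: str) -> Proposer:
    """Stand-in for an LLM call (assumption): each call returns a new labelled draft."""
    counter = count(1)

    def propose(prompt: str) -> str:
        return f"{name}-draft{next(counter)}"

    return propose


def join_aggregator(prompt: str, candidates: List[str]) -> str:
    """Toy aggregator (assumption): concatenates candidates instead of synthesizing them."""
    return " | ".join(candidates)


if __name__ == "__main__":
    print(mixed_moa("q", [make_stub_model("a"), make_stub_model("b")], join_aggregator))
    print(self_moa("q", make_stub_model("top"), join_aggregator, num_samples=3))
```

The only structural difference between the two functions is where the candidates come from: `mixed_moa` draws one response from each distinct model, while `self_moa` draws repeated samples from one model, so with a real LLM the candidates would need to differ across calls (for example through sampling randomness).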

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling