When Agents Disagree: The Selection Bottleneck in Multi-Agent LLM Pipelines

Artem Maryanskyy

arXiv:2603.20324·cs.MA·March 24, 2026

Open Access

TL;DR

This paper examines how team diversity and the aggregation method interact in multi-agent LLM pipelines. It identifies a crossover threshold in aggregation quality that determines whether diversity improves or worsens output quality, and finds that judge-based selection outperforms synthesis-based aggregation.

Contribution

The paper introduces a theoretical crossover threshold model and provides empirical evidence showing judge-based selection surpasses synthesis in multi-agent LLM pipelines.

Findings

01

Diverse teams with judge-based selection clearly outperform a single-model baseline (win rate 0.810, versus 0.512 for a homogeneous team).

02

Synthesis-based aggregation is less effective than judge-based selection across tasks.

03

Including weaker models can enhance performance and reduce costs.

Abstract

Multi-agent LLM pipelines produce contradictory evidence on whether team diversity improves output quality: heterogeneous Mixture-of-Agents teams outperform single models, yet homogeneous Self-MoA teams consistently win under synthesis-based aggregation. We propose a resolution by identifying the selection bottleneck -- a crossover threshold in aggregation quality that determines whether diversity helps or hurts. Under this model, we obtain a closed-form crossover threshold $s^{*}$ (Proposition 1) that separates the regimes where diversity helps and hurts. In a targeted experiment spanning 42 tasks across 7 categories ($N = 210$), a diverse team with judge-based selection achieves a win rate of 0.810 against a single-model baseline, while a homogeneous team scores 0.512 -- near chance (Glass's $\Delta = 2.07$). Judge-based selection outperforms MoA-style synthesis by $\Delta_{\mathrm{WR}} =…
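The abstract reports two summary statistics: a win rate against a baseline and Glass's $\Delta$ as an effect size. The following is a minimal sketch of how such quantities are commonly computed, not the paper's evaluation code. It assumes each comparison is scored 1.0 for a win, 0.5 for a tie, and 0.0 for a loss, and that the homogeneous team serves as the control group for Glass's $\Delta$; both are assumptions, since the abstract does not spell them out. The function names and the sample scores are hypothetical.

```python
from statistics import mean, stdev


def win_rate(outcomes):
    """Mean outcome over comparisons (assumed coding: 1.0 win, 0.5 tie, 0.0 loss)."""
    return mean(outcomes)


def glass_delta(treatment, control):
    """Glass's Delta: difference in means divided by the control group's sample SD."""
    return (mean(treatment) - mean(control)) / stdev(control)


# Hypothetical per-task outcomes for illustration only; NOT data from the paper.
diverse = [1.0, 1.0, 0.5, 1.0, 1.0, 0.5, 1.0, 1.0]
homogeneous = [0.5, 0.5, 1.0, 0.0, 0.5, 0.5, 1.0, 0.0]

if __name__ == "__main__":
    print("diverse win rate:    ", win_rate(diverse))
    print("homogeneous win rate:", win_rate(homogeneous))
    print("Glass's Delta:       ", round(glass_delta(diverse, homogeneous), 3))
```

Unlike Cohen's $d$, Glass's $\Delta$ scales the mean difference by the control group's standard deviation alone, which is a common choice when the two conditions are expected to have different variances.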

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Ethics and Social Impacts of AI · Multi-Agent Systems and Negotiation