The Cost of Consensus: Isolated Self-Correction Prevails Over Unguided Homogeneous Multi-Agent Debate
Blaž Bertalanič, Carolina Fortuna

TL;DR
This study empirically compares homogeneous multi-agent debate with isolated self-correction in large language models, revealing that debate often leads to conformity, fragility, and higher costs without improving accuracy.
Contribution
It provides a detailed analysis of failure modes in homogeneous debate among LLMs and demonstrates that self-correction is more cost-effective for accuracy.
Findings
Peer debate increases token consumption by 2.1-3.4 times compared to self-correction.
Conformity reaches 85.5% at minimal peer exposure, reducing diversity.
Homogeneous debate does not outperform isolated self-correction in accuracy for 7-8B models.
Abstract
Multi-agent debate, where teams of LLMs iteratively exchange rationales and vote on answers, is widely deployed under the assumption that peer review filters hallucinations. Yet the failure dynamics of homogeneous debate remain poorly understood. We therefore report findings from a controlled empirical study of teams of homogeneous agents (Qwen2.5-7B, Llama-3.1-8B, Ministral-3-8B) across debate rounds on two high-difficulty benchmarks (GSM-Hard and MMLU-Hard). We compare peer debate against isolated self-correction and a stochastic noise control that injects rationales from unrelated problems. We decompose debate failure into three model-dependent pathways: sycophantic conformity, where agents uncritically adopt majority answers (modal adoption up to 85.5%); contextual fragility, where peer rationales destabilize previously correct reasoning (vulnerability rate up to…
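To make the two named metrics concrete, the sketch below shows one way modal adoption and vulnerability rate could be computed from per-agent answers in two consecutive debate rounds. The exact definitions used in the paper are not given in this summary, so the definitions here are assumptions: modal adoption is taken as the share of agents who disagreed with their peers' most common answer and then switched to it, and vulnerability rate as the share of previously correct agents who became incorrect. The function names are hypothetical.

```python
from collections import Counter


def modal_adoption_rate(before, after):
    """Assumed definition: among agents whose answer differed from the most
    common answer of their peers, the fraction that switched to that answer."""
    eligible = 0
    adopted = 0
    for i, own in enumerate(before):
        # Answers of every other agent in the previous round.
        peers = before[:i] + before[i + 1:]
        peer_mode = Counter(peers).most_common(1)[0][0]
        if own != peer_mode:
            eligible += 1
            if after[i] == peer_mode:
                adopted += 1
    return adopted / eligible if eligible else 0.0


def vulnerability_rate(before, after, gold):
    """Assumed definition: among agents that were correct before seeing peer
    rationales, the fraction that are incorrect afterwards."""
    correct_before = [i for i, a in enumerate(before) if a == gold]
    if not correct_before:
        return 0.0
    flipped = sum(1 for i in correct_before if after[i] != gold)
    return flipped / len(correct_before)


# Toy example (made-up answers, not data from the paper): six agents, gold answer "A".
before = ["A", "A", "B", "B", "B", "B"]
after = ["B", "A", "B", "B", "B", "B"]
adoption = modal_adoption_rate(before, after)
vulnerability = vulnerability_rate(before, after, gold="A")
```

In this toy round, two agents start out disagreeing with the peer majority and one of them switches, so both rates come out to 0.5. Ties for the most common peer answer are resolved by first occurrence here, which is an arbitrary choice of this sketch.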
