Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets

Dat Tran; Douwe Kiela

arXiv:2604.02460·cs.CL·April 14, 2026

TL;DR

This study shows that single-agent large language models often match or outperform multi-agent systems on multi-hop reasoning when both are given the same reasoning-token budget, challenging the perceived advantages of multi-agent architectures.

Contribution

The paper provides an information-theoretic argument, grounded in the Data Processing Inequality, together with empirical evidence that single-agent systems are more information-efficient under a fixed reasoning-token budget, calling into question earlier claims of multi-agent gains.

Findings

01

Single-agent models match or outperform multi-agent systems on reasoning tasks with equal token budgets.

02

API-based budget controls and standard benchmarks can inflate multi-agent system performance.

03

Multi-agent advantages are often due to unaccounted computation and context effects, not architecture.

Abstract

Recent work reports strong performance from multi-agent LLM systems (MAS), but these gains are often confounded by increased test-time computation. When computation is normalized, single-agent systems (SAS) can match or outperform MAS, yet the theoretical basis and evaluation methodology behind this comparison remain unclear. We present an information-theoretic argument, grounded in the Data Processing Inequality, suggesting that under a fixed reasoning-token budget and with perfect context utilization, single-agent systems are more information-efficient. This perspective further predicts that multi-agent systems become competitive when a single agent's effective context utilization is degraded, or when more compute is expended. We test these predictions in a controlled empirical study across three model families (Qwen3, DeepSeek-R1-Distill-Llama, and Gemini 2.5), comparing SAS with…
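As a rough illustration of the kind of argument the abstract describes, the Data Processing Inequality can be stated as follows. The symbols here are assumptions chosen for illustration and are not the paper's notation: Y is the correct answer, C is the full context available to a single agent, and M is the set of messages that agents in a multi-agent system pass to one another, computed from C alone.

```latex
% Illustrative sketch only; Y, C, M are assumed symbols, not the paper's notation.
% If the messages M depend on the answer Y only through the context C,
% the three variables form a Markov chain:
\[
  Y \;\longrightarrow\; C \;\longrightarrow\; M .
\]
% The Data Processing Inequality then bounds the information M carries about Y:
\[
  I(Y; M) \;\le\; I(Y; C).
\]
% Equality holds only if M is a sufficient statistic of C for Y,
% i.e. the chain Y -> M -> C also holds.
```

Under this reading, an agent that consumes only M cannot have more information about Y than an agent that reads C directly, provided the latter uses its context perfectly. That matches the abstract's caveat that multi-agent systems become competitive when a single agent's effective context utilization is degraded, or when more compute is expended.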
