More Is Not Always Better: Cross-Component Interference in LLM Agent Scaffolding
Ming Liu

TL;DR
This paper shows that stacking multiple scaffolding components in LLM agent systems can cause destructive interactions that lower performance, and it proposes interaction-aware subset selection to find better task-specific configurations.
Contribution
The paper analyzes cross-component interference (CCI) in LLM agents through a full factorial experiment, shows that smaller component subsets can outperform the All-In configuration, and introduces methods for selecting the best subset.
Findings
The All-In system is consistently outperformed by configurations with fewer components.
Optimal component subsets vary by task and scale.
Interactions among components can be reliably analyzed using regression and Shapley values.
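To make the regression finding concrete, here is a minimal sketch of a main-effects fit over a full 2^5 factorial design. The component names follow the abstract, but the scores are synthetic and chosen only for illustration; they are not the paper's data. The helpers `design_matrix` and `fit_main_effects` are hypothetical names, and the paper's actual model specification may differ.

```python
import itertools
import numpy as np

# Component names taken from the abstract; the column order is an assumption.
COMPONENTS = ["planning", "tools", "memory", "reflection", "retrieval"]

def design_matrix(n):
    """Return all 2^n on/off configurations, one row per configuration."""
    return np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)

def fit_main_effects(X, y):
    """Least-squares fit of y ~ b0 + sum_i b_i * x_i; returns (coefs, R^2)."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coefs
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coefs, 1.0 - ss_res / ss_tot

X = design_matrix(len(COMPONENTS))

# Synthetic scores (NOT the paper's results): tools help, planning helps a
# little, and planning combined with memory interferes destructively.
y = 0.10 + 0.02 * X[:, 0] + 0.12 * X[:, 1] - 0.08 * X[:, 0] * X[:, 2]

coefs, r2 = fit_main_effects(X, y)
for name, b in zip(["intercept"] + COMPONENTS, coefs):
    print(f"{name:>10s}: {b:+.3f}")
print(f"R^2 = {r2:.3f}")
```

Because the synthetic scores contain a pairwise interaction, the main-effects model cannot fit them exactly, so R^2 falls below 1. The unexplained residual is one simple signal that components interact rather than contribute independently.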
Abstract
LLM agent systems are built by stacking scaffolding components (planning, tools, memory, self-reflection, retrieval) assuming more is better. We study cross-component interference (CCI): degradation when components interact destructively. We run a full factorial experiment over all 2^5=32 subsets of five components on HotpotQA and GSM8K with Llama-3.1-8B/70B (96 conditions, up to 10 seeds). The All-In system is consistently suboptimal: on HotpotQA, a single-tool agent surpasses All-In by 32% (F1 0.233 vs 0.177, p=0.023); on GSM8K, a 3-component subset beats All-In by 79% (0.43 vs 0.24, p=0.010). Optimal component count is task-dependent (k*=1-4) and scale-sensitive: at 70B, combinations that hurt at 8B provide gains, though All-In still trails the best subset. We fit a main-effects regression (R^2=0.916, adj-R^2=0.899, LOOCV=0.872), compute exact Shapley values, and find 183/325…
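With only five components, exact Shapley values can be computed by enumerating every subset, with no sampling needed. The sketch below illustrates that computation together with a brute-force search for the best subset. The value function `toy_score` is hypothetical and stands in for a measured task score (for example F1 on a benchmark); its numbers are invented for illustration and do not come from the paper.

```python
from itertools import combinations
from math import factorial

COMPONENTS = ["planning", "tools", "memory", "reflection", "retrieval"]

def toy_score(subset):
    """Hypothetical score for a component subset (illustrative numbers only)."""
    score = 0.10
    if "tools" in subset:
        score += 0.12
    if "planning" in subset:
        score += 0.02
    if "planning" in subset and "memory" in subset:
        score -= 0.08  # destructive interaction between two components
    return score

def exact_shapley(players, value):
    """Exact Shapley value of each player by enumerating all coalitions."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes player p.
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = frozenset(combo)
                total += w * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

def all_subsets(players):
    """Yield every subset of players, including the empty set."""
    for k in range(len(players) + 1):
        for combo in combinations(players, k):
            yield frozenset(combo)

phi = exact_shapley(COMPONENTS, toy_score)
best = max(all_subsets(COMPONENTS), key=toy_score)
all_in = frozenset(COMPONENTS)

for name in COMPONENTS:
    print(f"{name:>10s}: {phi[name]:+.3f}")
print("best subset score:", round(toy_score(best), 3))
print("All-In score:     ", round(toy_score(all_in), 3))
```

In this toy setting the All-In configuration scores lower than the best subset because the negative interaction is only triggered when both planning and memory are enabled, which mirrors the kind of interference the abstract describes.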
