The Reasoning Trap: An Information-Theoretic Bound on Closed-System Multi-Step LLM Reasoning
Kwan Soo Shin

TL;DR
This paper introduces an information-theoretic framework to analyze the limitations of multi-step reasoning in language models, highlighting the 'Reasoning Trap' where reasoning quality degrades over iterations.
Contribution
It proposes a new theoretical bound on closed-system reasoning, introduces metrics and methods for evidence-grounded reasoning, and empirically validates these concepts across multiple datasets.
Findings
DebateCV preserves 88% of baseline accuracy while SFS drops 43%.
Majority-vote MAD reduces SFS to 1.7%, while EGSR recovers 98%.
Human agreement on faithfulness metrics is unstable across languages and domains.
Abstract
When copies of the same language model are prompted to debate, they produce diverse phrasings of one perspective rather than diverse perspectives. Multi-agent debate (MAD), and more broadly closed-system reasoning where agents iteratively transform each other's outputs, tends to preserve answer accuracy while degrading the reasoning behind those answers. We name the multi-agent case the Debate Trap and the broader phenomenon the Reasoning Trap, offering a programmatic theory of evidence-grounded reasoning failure. The framework has three parts: (i) SFS (Supported Faithfulness Score), a claim-level metric verifying decomposed atomic claims against provided evidence (decomposer-invariant rankings: Spearman rho=1.0); (ii) EGSR (Evidence-Grounded Socratic Reasoning), replacing adversarial argumentation with evidence-grounded inquiry; (iii) Theorem 1 (DPI Bound): under standard MAD, the chain…
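As a rough illustration of the intuition behind a DPI-style bound, the sketch below numerically checks the classical data-processing inequality on a toy Markov chain. It assumes "DPI" refers to the data-processing inequality; it is not the paper's Theorem 1, whose statement is truncated above. The binary evidence variable, the channel matrices `read` and `rewrite`, and all function names are hypothetical choices made for this example. Each round transforms only the previous round's output and never re-reads the evidence, which is the closed-system setting the abstract describes.

```python
import math


def mutual_information(joint):
    """Return I(A;B) in bits for a joint distribution joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (pa[a] * pb[b]))
    return mi


def push_through(joint, channel):
    """Given the joint of (X, Y) and channel[y][z] = P(Z=z | Y=y),
    return the joint of (X, Z) for the Markov chain X -> Y -> Z."""
    n_z = len(channel[0])
    return [
        [sum(row[y] * channel[y][z] for y in range(len(row))) for z in range(n_z)]
        for row in joint
    ]


def information_after_rounds(joint, channel, rounds):
    """Track I(X; output) as the output is repeatedly rewritten
    without access to X (a closed system)."""
    values = [mutual_information(joint)]
    for _ in range(rounds):
        joint = push_through(joint, channel)
        values.append(mutual_information(joint))
    return values


# Hypothetical toy setup: X is a uniform binary "evidence" variable.
p_x = [0.5, 0.5]
# First agent reads the evidence through a noisy channel P(Y | X).
read = [[0.9, 0.1], [0.1, 0.9]]
joint_xy = [[p_x[x] * read[x][y] for y in range(2)] for x in range(2)]
# Every later round rewrites the previous output through P(Z | Y).
rewrite = [[0.8, 0.2], [0.2, 0.8]]

info = information_after_rounds(joint_xy, rewrite, rounds=4)
for round_index, bits in enumerate(info):
    print(f"round {round_index}: I(X; output) = {bits:.4f} bits")
```

In this toy chain the information about the evidence can only stay flat or shrink from round to round, because no round consults `X` again. Re-injecting evidence at each step breaks the Markov chain, so the inequality no longer applies in that form.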
