Deep Reasoning in General Purpose Agents via Structured Meta-Cognition
Dean Light, Michael Theologitis, Kshitish Ghate, Shuyue Stella Li, Benjamin Newman, Chirag Shah, Aylin Caliskan, Pang Wei Koh, Dan Suciu, Yulia Tsvetkov

TL;DR
The paper introduces Deep Reasoning, a method for constructing task-specific reasoning scaffolds at inference time using structured meta-reasoning, significantly improving performance on complex benchmarks.
Contribution
The paper presents a novel inference-time approach for adaptive scaffold construction in general-purpose agents, enabling flexible reasoning strategies tailored to each task.
Findings
DOLORES outperforms state-of-the-art scaffolding methods across four benchmarks.
DOLORES improves over the strongest baseline by 24.8% on average.
An 8B model surpasses 32B baselines in many settings.
Abstract
Humans intuitively solve complex problems by flexibly shifting among reasoning modes: they plan, execute, revise intermediate goals, resolve ambiguity through associative judgment, and apply formal procedures to well-specified subproblems. Current LLM agents lack this flexibility, as their scaffolds hard-code such reasoning decisions in advance. These scaffolds are effective when their prescribed structure matches the task, but brittle when solving the task requires adapting the structure of reasoning itself. We introduce Deep Reasoning, an inference-time approach for constructing task-specific scaffolds through structured meta-reasoning. Deep Reasoning uses a formal language that represents meta-reasoning as executable decompositions over associative inference, formal computation, and recursive subproblem solving, enabling decomposition principles to be encoded as in-context examples…
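One way to picture the decomposition idea in the abstract is a small interpreter over three kinds of steps. The sketch below is illustrative only: the class names `Associative`, `Formal`, and `Recursive`, the `run` function, and the `stub_judge` stand-in for a model call are all assumptions made here, not the paper's formal language or the DOLORES implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


# Hypothetical step types; the names are illustrative, not taken from the paper.
@dataclass
class Associative:
    """A step resolved by associative judgment (a model call in practice)."""
    prompt: str


@dataclass
class Formal:
    """A step resolved by exact computation over the results so far."""
    fn: Callable[[List[object]], object]


@dataclass
class Recursive:
    """A subproblem that is itself a list of steps."""
    steps: List["Step"]


Step = Union[Associative, Formal, Recursive]


def run(steps: List[Step], judge: Callable[[str], object]) -> object:
    """Execute steps in order and return the last result (None if empty)."""
    results: List[object] = []
    for step in steps:
        if isinstance(step, Associative):
            results.append(judge(step.prompt))
        elif isinstance(step, Formal):
            results.append(step.fn(results))
        else:
            # Recursive: solve the subproblem in its own result scope.
            results.append(run(step.steps, judge))
    return results[-1] if results else None


def stub_judge(prompt: str) -> object:
    # Assumption: canned answers replace a real model call.
    canned = {"How many apples per crate?": 12, "How many crates?": 3}
    return canned[prompt]


plan: List[Step] = [
    Recursive([
        Associative("How many apples per crate?"),
        Associative("How many crates?"),
        Formal(lambda r: r[0] * r[1]),  # 12 * 3
    ]),
    Formal(lambda r: r[0] + 1),  # add one loose apple to the subtotal
]
answer = run(plan, stub_judge)
print(answer)
```

The associative steps are answered from a canned table so the example runs without a model; in an actual agent they would presumably be model calls, and the plan itself would be produced at inference time rather than written by hand.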
