Stateful Reasoning via Insight Replay
Bin Lei, Caiwen Ding, Jiachen Yang, Ang Li, Xin Eric Wang

TL;DR
InsightReplay is a stateful reasoning method that enhances large language models by periodically re-accessing critical insights, significantly improving accuracy across diverse benchmarks and model scales.
Contribution
Introduces InsightReplay, an approach that keeps key insights accessible throughout reasoning, addressing the decline in accuracy observed as chains of thought grow longer.
Findings
Achieves +1.65 average accuracy improvement over standard CoT.
Gains up to +9.2 points in specific settings.
Improves reasoning performance across multiple models and benchmarks.
Abstract
Chain-of-Thought (CoT) reasoning has become a foundation for eliciting multi-step reasoning in large language models, but recent studies show that its benefits do not scale monotonically with chain length: while longer CoT generally enables a model to tackle harder problems, on a given problem, accuracy typically increases with CoT length up to a point, after which it declines. We identify a major cause of this phenomenon: as the CoT grows, the model's attention to critical insights produced earlier in the trace gradually weakens, making those insights progressively less accessible when they are most needed. Therefore, we propose InsightReplay, a stateful reasoning approach in which the model periodically extracts critical insights from its reasoning trace and replays them near the active generation frontier, keeping them accessible as the reasoning scales. Extensive…
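The extract-and-replay loop described in the abstract can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `insight_replay`, the callables `generate_chunk`, `extract_insights`, and `is_done`, the `replay_every` schedule, and the "Key insights so far:" format are all hypothetical stand-ins, since the abstract does not specify how insights are selected or how often they are replayed.

```python
from typing import Callable, List


def insight_replay(
    question: str,
    generate_chunk: Callable[[str], str],          # hypothetical: next reasoning chunk given the trace
    extract_insights: Callable[[str], List[str]],  # hypothetical: picks key insights from the trace
    is_done: Callable[[str], bool],                # hypothetical: detects a final-answer chunk
    replay_every: int = 2,                         # assumed schedule; not taken from the paper
    max_chunks: int = 8,
) -> str:
    """Generate reasoning in chunks, periodically re-stating key insights
    at the end of the trace so they sit near the generation frontier."""
    trace = question
    for step in range(1, max_chunks + 1):
        chunk = generate_chunk(trace)
        trace += "\n" + chunk
        if is_done(chunk):
            break
        if step % replay_every == 0:
            insights = extract_insights(trace)
            if insights:
                # Append the replay block right where generation continues.
                replay = "Key insights so far:\n" + "\n".join(f"- {s}" for s in insights)
                trace += "\n" + replay
    return trace
```

In practice `generate_chunk` and `extract_insights` would wrap calls to a language model; here they are left as plain callables so the control flow can be inspected on its own.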
