RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval

Kyle Whitecross; Negin Rahimi

arXiv:2604.09494 · cs.CL · April 13, 2026

TL;DR

RecaLLM is a set of reasoning language models post-trained to interleave explicit in-context retrieval with reasoning steps. This improves long-context understanding and mitigates the lost-in-thought phenomenon, with strong results on the RULER and HELMET benchmarks.

Contribution

The paper introduces an approach that interleaves reasoning with explicit in-context retrieval, together with a constrained decoding mechanism, to improve long-context reasoning without extensive long-context training data.
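To make the idea of constrained decoding for retrieval concrete, here is a minimal sketch under stated assumptions. It is not the authors' implementation: this page does not describe the mechanism in detail, so the sketch assumes one plausible form, in which a retrieved span must be a verbatim contiguous copy of the context. The names `allowed_next_tokens`, `constrained_span`, and the toy `score` function are hypothetical, and whitespace-split words stand in for model tokens.

```python
from typing import Callable, List, Sequence, Set


def allowed_next_tokens(context: Sequence[str], prefix: Sequence[str]) -> Set[str]:
    """Return every token t such that prefix + [t] occurs contiguously in context.

    Assumption: retrieval output must be an exact copy of some context span.
    """
    n = len(prefix)
    allowed: Set[str] = set()
    # Only start positions that leave room for one more token after the prefix.
    for start in range(len(context) - n):
        if list(context[start:start + n]) == list(prefix):
            allowed.add(context[start + n])
    return allowed


def constrained_span(
    context: Sequence[str],
    score: Callable[[List[str], str], float],
    max_len: int,
) -> List[str]:
    """Greedily build a span, choosing the best-scoring token among allowed ones.

    `score` is a hypothetical stand-in for a language model's next-token score.
    """
    span: List[str] = []
    for _ in range(max_len):
        options = allowed_next_tokens(context, span)
        if not options:
            break  # the span has reached the end of the context
        # Sort first so that ties are broken deterministically.
        span.append(max(sorted(options), key=lambda t: score(span, t)))
    return span


# Toy usage: the "model" prefers to start at "key" and to follow the "bob" branch.
context = "key for alice is 42 . key for bob is 77 .".split()
preferences = {"key": 3.0, "bob": 2.0}


def score(span: List[str], token: str) -> float:
    return preferences.get(token, 0.0)


retrieved = constrained_span(context, score, max_len=5)
print(" ".join(retrieved))
```

Because every step is restricted to continuations that actually appear in the context, the copied span cannot contain tokens that are absent from the source text, which is the property this sketch is meant to show. A real system would apply the same mask to model logits rather than to a hand-written score.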

Findings

01. RecaLLM significantly outperforms baselines on the RULER and HELMET benchmarks.

02. It maintains strong performance at context windows of up to 128K tokens despite being trained on relatively short samples.

03. Explicit in-context retrieval mitigates the lost-in-thought problem in long-context reasoning.

Abstract

We propose RecaLLM, a set of reasoning language models post-trained to make effective use of long-context information. In-context retrieval, which identifies relevant evidence from context, and reasoning are deeply intertwined: retrieval supports reasoning, while reasoning often determines what must be retrieved. However, their interaction remains largely underexplored. In preliminary experiments on several open-source LLMs, we observe that in-context retrieval performance substantially degrades even after a short reasoning span, revealing a key bottleneck for test-time scaling that we refer to as lost-in-thought: reasoning steps that improve performance also make subsequent in-context retrieval more challenging. To address this limitation, RecaLLM interleaves reasoning with explicit in-context retrieval, alternating between reasoning and retrieving context information needed to solve…
