When to Retrieve During Reasoning: Adaptive Retrieval for Large Reasoning Models

Dongxin Guo; Jikun Wu; Siu Ming Yiu

arXiv:2604.26649·cs.IR·April 30, 2026

TL;DR

ReaLM-Retrieve is a framework that triggers retrieval during multi-step reasoning rather than only before it begins, improving answer accuracy while reducing the number of retrieval calls in large reasoning models.

Contribution

The paper introduces a step-level uncertainty detector, a learned retrieval intervention policy, and an efficient integration mechanism for reasoning-aware retrieval.

Findings

01

Achieves a 10.1% absolute improvement in answer F1 over standard RAG.

02

Reduces retrieval calls by 47% compared to fixed-interval approaches.

03

Improves retrieval quality with 81.3% Recall@5, establishing new efficiency-accuracy trade-offs.
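
For readers unfamiliar with the two metrics named in the findings, the sketch below shows one common way each is computed. It is an illustration only: this page does not state which variants the paper uses, so the definitions here (Recall@k as the fraction of relevant documents found in the top k, and answer F1 as whitespace-token overlap) are assumptions, and the function names are hypothetical.

```python
from collections import Counter


def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant documents that appear in the top-k results.

    Assumed definition; some evaluations instead report a hit rate
    (1 if any relevant document is in the top k, else 0).
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant) / len(relevant)


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall between two answers.

    Assumed normalization: lowercase and whitespace split only.
    """
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Under these definitions, a retriever that returns one of two relevant documents in its top five scores a Recall@5 of 0.5 for that query, and the reported figure would be the average over all queries.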

Abstract

Large reasoning models such as DeepSeek-R1 and OpenAI o1 generate extended chains of thought spanning thousands of tokens, yet their integration with retrieval-augmented generation (RAG) remains fundamentally misaligned. Current RAG systems optimize for providing context before reasoning begins, while reasoning models require evidence injection during multi-step inference chains. We introduce ReaLM-Retrieve, a reasoning-aware retrieval framework that addresses this mismatch through three key innovations: (1) a step-level uncertainty detector that identifies knowledge gaps at reasoning-step granularity rather than token or sentence level; (2) a retrieval intervention policy that learns when external evidence maximally benefits ongoing reasoning; and (3) an efficiency-optimized integration mechanism that reduces per-retrieval overhead by 3.2x compared to naive integration. Experiments on…
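
The control flow the abstract describes (score each reasoning step for uncertainty, then decide whether to retrieve before continuing) can be sketched as follows. This is a minimal toy under stated assumptions, not the authors' method: mean token entropy stands in for the step-level uncertainty detector, a fixed threshold with a call budget stands in for the learned intervention policy, and evidence is simply appended to the context. All names (`step_uncertainty`, `should_retrieve`, `generate_step`, `retrieve`) and the threshold values are hypothetical.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

# One reasoning step: its text plus one probability distribution per token.
Step = Tuple[str, Sequence[Sequence[float]]]


def step_uncertainty(token_dists: Sequence[Sequence[float]]) -> float:
    """Mean Shannon entropy (in nats) over the tokens of one reasoning step.

    Assumption: entropy is a stand-in for the paper's detector.
    """
    if not token_dists:
        return 0.0
    entropies = [
        -sum(p * math.log(p) for p in dist if p > 0.0) for dist in token_dists
    ]
    return sum(entropies) / len(entropies)


def should_retrieve(uncertainty: float, calls_so_far: int,
                    threshold: float = 0.5, max_calls: int = 3) -> bool:
    """Threshold rule with a call budget.

    Assumption: the paper learns this policy; a fixed rule is used here.
    """
    return uncertainty > threshold and calls_so_far < max_calls


def reason_with_adaptive_retrieval(
    generate_step: Callable[[List[str]], Optional[Step]],
    retrieve: Callable[[str], str],
    max_steps: int = 8,
    threshold: float = 0.5,
    max_calls: int = 3,
) -> Tuple[List[str], int]:
    """Generate steps one at a time, injecting evidence after uncertain ones."""
    context: List[str] = []
    calls = 0
    for _ in range(max_steps):
        step = generate_step(context)
        if step is None:  # the model signalled that reasoning is finished
            break
        text, dists = step
        context.append(text)
        if should_retrieve(step_uncertainty(dists), calls, threshold, max_calls):
            # Evidence is appended so later steps can condition on it.
            context.append("[evidence] " + retrieve(text))
            calls += 1
    return context, calls


# Toy usage: a scripted "model" whose second step is the uncertain one.
scripted = iter([
    ("The question asks where the founder was born.", [[0.97, 0.02, 0.01]]),
    ("The founder could be A or B.", [[0.4, 0.35, 0.25], [0.5, 0.5]]),
    ("So the answer follows from the evidence.", [[0.9, 0.1]]),
])
trace, n_calls = reason_with_adaptive_retrieval(
    generate_step=lambda ctx: next(scripted, None),
    retrieve=lambda query: "stub passage for: " + query,
)
```

In this toy run only the second step exceeds the entropy threshold, so retrieval fires once instead of after every step, which is the kind of saving the fixed-interval comparison in the findings refers to.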
