Procedural Knowledge at Scale Improves Reasoning

Di Wu; Devendra Singh Sachan; Wen-tau Yih; Mingda Chen

arXiv:2604.01348 · cs.CL · April 21, 2026

TL;DR

This paper introduces Reasoning Memory, a retrieval-augmented framework that leverages a large datastore of procedural reasoning steps to improve language model performance on complex reasoning tasks.

Contribution

The paper presents a retrieval-based method that explicitly reuses procedural knowledge from a large corpus to improve the reasoning of language models.

Findings

01 Outperforms existing retrieval methods across six benchmarks.

02 Improves accuracy by up to 19.2% with higher inference budgets.

03 Key factors include broad procedural coverage and effective retrieval design.

Abstract

Test-time scaling has emerged as an effective way to improve language models on challenging reasoning tasks. However, most existing methods treat each problem in isolation and do not systematically reuse knowledge from prior reasoning trajectories. In particular, they underutilize procedural knowledge: how to reframe a problem, choose an approach, and verify or backtrack when needed. We introduce Reasoning Memory, a retrieval-augmented generation (RAG) framework for reasoning models that explicitly retrieves and reuses procedural knowledge at scale. Starting from existing corpora of step-by-step reasoning trajectories, we decompose each trajectory into self-contained subquestion-subroutine pairs, yielding a datastore of 32 million compact procedural knowledge entries. At inference time, a lightweight in-thought prompt lets the model verbalize the core subquestion, retrieve relevant…
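
The retrieval loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the `ProceduralEntry` record, the three-entry toy datastore, the bag-of-words cosine scorer, and the `build_prompt` helper are hypothetical stand-ins for the 32-million-entry datastore, the retriever, and the in-thought prompt, none of which are specified in the text above.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProceduralEntry:
    """Hypothetical record for one subquestion-subroutine pair."""
    subquestion: str  # self-contained question that a reasoning step answers
    subroutine: str   # the procedure that answers it


# Toy datastore (assumption): three hand-written entries for illustration.
DATASTORE = [
    ProceduralEntry(
        "How do I count integers in a range divisible by a given number?",
        "Use floor(hi / d) - floor((lo - 1) / d).",
    ),
    ProceduralEntry(
        "How do I check whether a candidate root satisfies the original equation?",
        "Substitute the candidate back in and discard extraneous roots.",
    ),
    ProceduralEntry(
        "How do I find the sum of an arithmetic sequence?",
        "Multiply the number of terms by the average of the first and last term.",
    ),
]


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words counts over alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(subquestion: str, datastore: List[ProceduralEntry], k: int = 2) -> List[ProceduralEntry]:
    """Return the k entries whose subquestions best match the query."""
    query = tokenize(subquestion)
    ranked = sorted(
        datastore,
        key=lambda entry: cosine(query, tokenize(entry.subquestion)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(problem: str, retrieved: List[ProceduralEntry]) -> str:
    """Prepend retrieved subroutines to the problem as reusable hints."""
    hints = "\n".join(f"- {e.subquestion} {e.subroutine}" for e in retrieved)
    return f"Relevant procedures:\n{hints}\n\nProblem: {problem}"


# The model would verbalize a core subquestion; here it is written by hand.
core_subquestion = "How many integers between 1 and 100 are divisible by 7?"
top = retrieve(core_subquestion, DATASTORE, k=1)
prompt = build_prompt(core_subquestion, top)
```

In this sketch the query is matched against stored subquestions rather than against full trajectories, which mirrors the abstract's description of compact, self-contained entries. A system at the scale described would presumably replace the token-overlap scorer with a stronger retriever over an index.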
