Recursive Language Models
Alex L. Zhang, Tim Kraska, Omar Khattab

TL;DR
Recursive Language Models (RLMs) enable large language models to process much longer prompts by recursively examining and decomposing input snippets, significantly improving performance on long-context tasks.
Contribution
The paper introduces RLMs, a novel inference paradigm allowing LLMs to handle inputs far beyond their context window through recursive processing, and presents the first RLM-based model, RLM-Qwen3-8B.
Findings
RLMs process inputs up to two orders of magnitude (roughly 100x) longer than the underlying model's context window.
RLM-Qwen3-8B outperforms the base Qwen3-8B by 28.3% on average.
RLMs outperform vanilla frontier LLMs and common long-context and coding scaffolds across four diverse long-context tasks, at comparable cost.
Abstract
We study allowing large language models (LLMs) to process arbitrarily long prompts through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference paradigm that treats long prompts as part of an external environment and allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the prompt. We find that RLMs can successfully process inputs up to two orders of magnitude beyond model context windows and, even for shorter prompts, dramatically outperform the quality of vanilla frontier LLMs and common long-context and coding scaffolds (e.g., on GPT-5 by a median across the evaluated benchmarks of against compaction, against CodeAct with sub-calls, and against Claude Code) across four diverse long-context tasks while having comparable cost. At a small scale, we post-train the first…
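The recursive idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: in the paper the model itself decides programmatically how to examine and decompose the prompt, whereas the sketch below hard-codes one simple strategy (split the prompt in half on line boundaries, recurse, then combine). The names `recursive_answer`, `toy_llm`, and `WINDOW` are hypothetical, and `toy_llm` is a deterministic keyword matcher standing in for a real model call so the example runs offline.

```python
from typing import Callable

# Assumption: a "model call" is any function (query, context) -> answer.
LLM = Callable[[str, str], str]

WINDOW = 200  # hypothetical context limit, measured in characters


def toy_llm(query: str, context: str) -> str:
    """Stand-in for a model: returns the context lines that mention the query."""
    # The toy model refuses any context larger than its window.
    assert len(context) <= WINDOW, "context exceeds the toy model's window"
    hits = [line for line in context.splitlines() if query in line]
    return "\n".join(hits)


def recursive_answer(query: str, prompt: str, llm: LLM, window: int = WINDOW) -> str:
    """Answer `query` over `prompt` without passing more than `window` chars per call."""
    lines = prompt.splitlines()
    # Base case: the snippet fits, or it cannot be split further
    # (a single over-long line is truncated to the window).
    if len(prompt) <= window or len(lines) < 2:
        return llm(query, prompt[:window])
    # Recursive case: decompose into two snippets and recurse on each.
    mid = len(lines) // 2
    left = recursive_answer(query, "\n".join(lines[:mid]), llm, window)
    right = recursive_answer(query, "\n".join(lines[mid:]), llm, window)
    # Combine the partial answers with one more call over the short summaries.
    combined = left + "\n" + right
    return llm(query, combined[:window])


# Usage: a prompt several hundred times longer than the window, with one relevant line.
filler = [f"filler line {i}" for i in range(5000)]
filler.insert(3210, "passcode: 7481")
haystack = "\n".join(filler)

answer = recursive_answer("passcode", haystack, toy_llm)
print(answer)
```

The point of the sketch is only that the full prompt lives outside any single model call: each call sees a snippet or a pair of partial answers, never the whole input. A fixed halving strategy is the simplest choice for illustration; a model-driven decomposition, as the abstract describes, could instead search or slice the prompt selectively.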
