Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Jonas Geiping, Sean McLeish, Neel Jain, John Kirchenbauer, Siddharth Singh, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Tom Goldstein

TL;DR
This paper introduces a recurrent-depth language model that scales test-time computation by iterating a recurrent block in latent space, improving reasoning performance without specialized training data and while working with small context windows.
Contribution
The paper presents a new recurrent latent reasoning model that scales test-time compute through iterative unrolling, achieving significant reasoning improvements without specialized training data.
Findings
A proof-of-concept model is scaled to 3.5 billion parameters and 800 billion tokens.
Performance on reasoning benchmarks improves as test-time computation increases, sometimes dramatically.
Improvements continue up to a computation load equivalent to 50 billion parameters.
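The relationship between the 3.5 billion stored parameters and the 50-billion-parameter compute equivalent quoted in the abstract can be illustrated with simple arithmetic: weights inside a recurrent block are stored once but applied once per iteration. The sketch below assumes a hypothetical split of 2.0 billion non-recurrent and 1.5 billion recurrent parameters, chosen only so that the parts sum to 3.5 billion; the split, the function names, and the linear cost model are illustrative assumptions, not figures taken from this page.

```python
import math


def effective_params_b(non_recurrent_b, recurrent_b, num_steps):
    """Parameters 'touched' per token, in billions, under a simple linear model.

    Non-recurrent weights are applied once; recurrent weights are applied
    once per iteration of the recurrent block.
    """
    return non_recurrent_b + recurrent_b * num_steps


def steps_for_budget(non_recurrent_b, recurrent_b, target_b):
    """Smallest number of iterations whose effective size reaches target_b."""
    return math.ceil((target_b - non_recurrent_b) / recurrent_b)


# Hypothetical split (assumption): 2.0B non-recurrent + 1.5B recurrent = 3.5B stored.
NON_RECURRENT_B = 2.0
RECURRENT_B = 1.5

for steps in (1, 4, 16, 32):
    size = effective_params_b(NON_RECURRENT_B, RECURRENT_B, steps)
    print(f"{steps:>2} iterations -> {size:.1f}B effective parameters")
```

Under this assumed split, one iteration costs the same as the stored 3.5 billion parameters, and the effective size grows linearly with the iteration count while memory for the weights stays fixed.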
Abstract
We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This stands in contrast to mainstream reasoning models that scale up compute by producing more tokens. Unlike approaches based on chain-of-thought, our approach does not require any specialized training data, can work with small context windows, and can capture types of reasoning that are not easily represented in words. We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens. We show that the resulting model can improve its performance on reasoning benchmarks, sometimes dramatically, up to a computation load equivalent to 50 billion parameters.
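The core idea in the abstract, iterating one recurrent block so that depth is chosen at test time, can be sketched with a toy model. This is a minimal illustration under stated assumptions, not the authors' architecture: the class name `ToyRecurrentDepthModel`, the tanh update, the zero initial state, and the random untrained weights are all hypothetical choices made to keep the example dependency-free.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng, scale):
    """Gaussian-initialised matrix as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class ToyRecurrentDepthModel:
    """Embed the input once, iterate one shared block, then read out."""

    def __init__(self, d_in, d_hidden, d_out, seed=0):
        rng = random.Random(seed)
        self.d_hidden = d_hidden
        self.W_embed = random_matrix(d_hidden, d_in, rng, 1.0 / math.sqrt(d_in))
        # The two matrices below form the single recurrent block; they are
        # reused on every iteration, so depth adds compute but no parameters.
        self.W_state = random_matrix(d_hidden, d_hidden, rng, 0.5 / math.sqrt(d_hidden))
        self.W_input = random_matrix(d_hidden, d_hidden, rng, 0.5 / math.sqrt(d_hidden))
        self.W_out = random_matrix(d_out, d_hidden, rng, 1.0 / math.sqrt(d_hidden))

    def forward(self, x, num_steps):
        e = matvec(self.W_embed, x)   # input embedding, computed once
        s = [0.0] * self.d_hidden     # initial latent state (zeros: an assumption)
        for _ in range(num_steps):    # unroll to the depth requested at test time
            from_state = matvec(self.W_state, s)
            from_input = matvec(self.W_input, e)
            s = [math.tanh(a + b) for a, b in zip(from_state, from_input)]
        return matvec(self.W_out, s)  # read out from the final latent state


model = ToyRecurrentDepthModel(d_in=4, d_hidden=8, d_out=3, seed=0)
x = [1.0, -0.5, 0.25, 2.0]
for steps in (1, 4, 16):
    print(steps, [round(v, 4) for v in model.forward(x, steps)])
```

The same weights serve every call; only `num_steps` changes. No intermediate tokens are produced between iterations, which is the contrast the abstract draws with models that scale compute by generating more tokens.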
Code & Models
- 🤗 tomg-group-umd/huginn-0125 · model · 3.6k downloads · ♡ 290
- 🤗 tomg-group-umd/step-00006144-recurrence_full_512_0 · model · 3 downloads
- 🤗 tomg-group-umd/step-00011904-recurrence_full_512_0 · model · 5 downloads
- 🤗 tomg-group-umd/step-00023808-recurrence_full_512_0 · model · 4 downloads
- 🤗 tomg-group-umd/step-00029824-recurrence_full_512_0 · model · 3 downloads
- 🤗 tomg-group-umd/step-00035840-recurrence_full_512_0 · model · 4 downloads
- 🤗 tomg-group-umd/step-00010720-baseline_2_0 · model · 5 downloads
- 🤗 tomg-group-umd/step-00006144-baseline_2_0 · model · 3 downloads
- 🤗 tomg-group-umd/step-00041728-recurrence_full_512_0 · model · 5 downloads
- 🤗 tomg-group-umd/step-00010752-recurrence_full_512_0 · model · 1 download
Taxonomy
Topics: Advanced Database Systems and Queries · Natural Language Processing Techniques · Graph Theory and Algorithms
