Learning When to Stop: Adaptive Latent Reasoning via Reinforcement Learning

Alex Ning; Yen-Ling Kuo; Gabe Gomes

arXiv:2511.21581·cs.LG·November 27, 2025

TL;DR

This paper introduces adaptive-length latent reasoning for Transformer language models and uses reinforcement learning to shorten the reasoning, reducing compute while maintaining accuracy.

Contribution

The paper develops adaptive-length latent reasoning models and a post-SFT reinforcement-learning method that minimizes latent reasoning length while maintaining accuracy, which lowers compute usage and pushes the compressive capability of latent reasoning further.
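
One way to picture the trade-off between accuracy and reasoning length is a reward that pays for a correct answer and charges for each latent step used. The sketch below is an illustration only: the function name, the linear penalty, and the `length_weight` parameter are assumptions, not the reward defined in the paper.

```python
def length_penalized_reward(
    correct: bool,
    num_latent_steps: int,
    max_latent_steps: int,
    length_weight: float = 0.1,
) -> float:
    """Hypothetical reward: 1.0 for a correct answer, minus a linear length penalty.

    The penalty is the fraction of the latent-step budget that was used,
    scaled by length_weight (an assumed hyperparameter, not from the paper).
    """
    if max_latent_steps <= 0:
        raise ValueError("max_latent_steps must be positive")
    if not 0 <= num_latent_steps <= max_latent_steps:
        raise ValueError("num_latent_steps must lie in [0, max_latent_steps]")

    accuracy_term = 1.0 if correct else 0.0
    length_term = length_weight * (num_latent_steps / max_latent_steps)
    return accuracy_term - length_term
```

With a small `length_weight`, a correct answer always scores higher than an incorrect one, so under this assumed shaping the policy is nudged toward shorter reasoning only among rollouts that remain correct.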

Findings

01 52% reduction in total reasoning length with no accuracy penalty (Llama 3.2 1B, GSM8K-Aug)

02 Reinforcement learning after SFT effectively optimizes latent reasoning length

03 Shorter latent reasoning further reduces compute usage

Abstract

Latent reasoning represents a new development in Transformer language models that has shown potential in compressing reasoning lengths compared to chain-of-thought reasoning. By directly passing the information-rich previous final latent state into the next sequence, latent reasoning removes the restriction to human language tokens as the medium for reasoning. We develop adaptive-length latent reasoning models and introduce a post-SFT reinforcement-learning methodology to optimize latent reasoning length by minimizing reasoning length while maintaining accuracy. This, in turn, further reduces compute usage and raises the bar on the compressive capabilities of latent reasoning models. Experiments on the Llama 3.2 1B model and the GSM8K-Aug dataset show a 52% drop in total reasoning length with no penalty to accuracy. In future work, we plan to extend to additional models and datasets,…
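
The control flow of adaptive-length latent reasoning can be sketched as a loop that feeds each latent state back in as the next input and stops once a stopping signal fires or a step budget runs out. This is a toy sketch under stated assumptions: `step_fn`, `stop_fn`, `stop_threshold`, and the two toy functions are hypothetical stand-ins, where in a real model `step_fn` would be a Transformer forward pass over hidden states. It is not the authors' implementation.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def adaptive_latent_rollout(
    initial_state: Vector,
    step_fn: Callable[[Vector], Vector],
    stop_fn: Callable[[Vector], float],
    max_steps: int,
    stop_threshold: float = 0.5,
) -> Tuple[Vector, int]:
    """Run latent steps until stop_fn reaches the threshold or the budget is spent.

    Returns the final latent state and the number of latent steps taken.
    """
    state = initial_state
    steps_used = 0
    while steps_used < max_steps:
        # Assumed stopping rule: halt when the stop score reaches the threshold.
        if stop_fn(state) >= stop_threshold:
            break
        # Feed the previous latent state straight back in, with no token decoding.
        state = step_fn(state)
        steps_used += 1
    return state, steps_used


def halve(state: Vector) -> Vector:
    """Toy stand-in for a model forward pass: shrink every component."""
    return [0.5 * x for x in state]


def confidence(state: Vector) -> float:
    """Toy stop score: grows as the largest component shrinks toward zero."""
    return 1.0 - min(1.0, max(abs(x) for x in state))


final_state, steps_used = adaptive_latent_rollout(
    [1.0, -0.5], halve, confidence, max_steps=6
)
```

The number of steps returned by such a loop is the quantity a length-aware training objective would try to reduce while keeping the final answer correct.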

Code & Models

Models

Lapisbird/Llama-adaLR-model-latent-6-by-1 (Hugging Face model, 10 downloads)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques