LaTER: Efficient Test-Time Reasoning via Latent Exploration and Explicit Verification

Xuan Li; Yining Wang; Yuchen Liu; Guanjun Liu; Delai Qiu; Shengping Liu; Jiaen Liang; Wei Huang; Jun Yu; Junnan Zhu

arXiv:2605.07315·cs.CL·May 11, 2026

TL;DR

LaTER is a two-stage reasoning paradigm for large language models: bounded exploration in a continuous latent space, followed by explicit chain-of-thought verification and answer generation. It reduces token usage while improving accuracy.

Contribution

The paper introduces a training-free instantiation of latent-then-explicit reasoning and a supervised corpus for fine-tuning, improving reasoning efficiency and accuracy over standard chain-of-thought methods.

Findings

01 Reduces token usage by 16%-32% on benchmarks.

02 Improves AIME 2025 accuracy from 70.0% to 73.3%.

03 Fine-tuning with LaTER achieves 80.0% accuracy on AIME 2025.
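
To put the AIME 2025 percentages in concrete terms, the short calculation below converts them into problem counts. It assumes accuracy is measured in a single pass over the 30 problems of AIME 2025 (AIME I and AIME II, 15 problems each). The listing does not state the evaluation protocol, so this is one possible reading rather than the authors' reported setup; `problems_solved` is a hypothetical helper.

```python
# Assumption: accuracy is single-pass over all 30 AIME 2025 problems.
AIME_2025_PROBLEMS = 30  # AIME I + AIME II, 15 problems each


def problems_solved(accuracy_pct, n=AIME_2025_PROBLEMS):
    """Convert a percentage accuracy into an (unrounded) problem count."""
    return accuracy_pct / 100 * n


for pct in (70.0, 73.3, 80.0):
    # Round to the nearest whole problem for display.
    print(f"{pct}% -> about {round(problems_solved(pct))} of {AIME_2025_PROBLEMS}")
```

Under that assumption, the gain from 70.0% to 73.3% corresponds to one additional problem solved, and 80.0% corresponds to three more than the baseline. If the reported numbers are averaged over several samples per problem, the mapping to whole problems no longer holds.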

Abstract

Chain-of-thought (CoT) reasoning improves large language models (LLMs) on difficult tasks, but it also makes inference expensive because every intermediate step must be generated as a discrete token. Latent reasoning reduces visible token generation by propagating continuous states, yet replacing explicit derivations with latent computation can hurt tasks that require symbolic checking. We propose Latent-Then-Explicit Reasoning (LaTER), a two-stage paradigm that first performs bounded exploration in a continuous latent space and then switches to explicit CoT for verification and answer generation. In a training-free instantiation, LaTER projects final-layer hidden states back to the input embedding space, preserves the latent KV cache, and uses entropy and model-native stop-token probes to decide when to switch. We find that strong reasoning models already exhibit structured latent…
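
The training-free control flow described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the model is a random NumPy stand-in (`toy_step`), the projection back to embedding space is done as a probability-weighted average of input embeddings through a tied output head, and the threshold values and all function names are hypothetical. The abstract does not specify the projection or the probe thresholds.

```python
import numpy as np


def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()


def entropy(p):
    return float(-(p * np.log(p + 1e-12)).sum())


def latent_then_explicit(step_fn, embed, first_input, stop_id,
                         max_latent_steps=8, entropy_thresh=1.0, stop_thresh=0.5):
    """Run bounded latent steps and decide when to switch to explicit CoT.

    step_fn(x, cache) -> (hidden, cache) is a hypothetical one-step forward pass.
    embed is a (vocab, dim) input embedding matrix, reused here as a tied
    output head (an assumption made for this sketch).
    Returns (latent_steps_used, cache, reason_for_switching).
    """
    x, cache = first_input, []
    for t in range(max_latent_steps):
        hidden, cache = step_fn(x, cache)
        probs = softmax(embed @ hidden)  # next-token distribution, used only as a probe
        # Probes: low entropy, or high probability on the stop token.
        if entropy(probs) < entropy_thresh or probs[stop_id] > stop_thresh:
            return t + 1, cache, "probe"
        # Assumed projection: expected input embedding under probs (no token is sampled).
        x = probs @ embed
    return max_latent_steps, cache, "budget"


# Toy stand-in model so the sketch runs end to end.
rng = np.random.default_rng(0)
E = rng.normal(size=(16, 8))  # hypothetical vocabulary of 16 tokens, dimension 8
W = rng.normal(size=(8, 8))


def toy_step(x, cache):
    cache = cache + [x]  # stand-in for appending this step's keys and values
    return np.tanh(W @ x), cache


steps, cache, reason = latent_then_explicit(toy_step, E, E[0], stop_id=15)
print(steps, len(cache), reason)
```

The returned `cache` is kept rather than discarded, mirroring the abstract's point that the latent KV cache is preserved: explicit chain-of-thought decoding would continue from it instead of recomputing the latent steps. The `max_latent_steps` budget corresponds to the "bounded" part of the exploration.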

Code & Models

Repositories

TioeAre/LaTER (GitHub)
