The Detection-Extraction Gap: Models Know the Answer Before They Can Say It

Hanyang Wang; Mingxuan Zhu

arXiv:2604.06613·cs.CL·April 10, 2026


TL;DR

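This paper identifies the detection-extraction gap in reasoning models: the correct answer is often recoverable from an early prefix of the chain of thought, yet the model keeps generating, and forced extraction frequently fails to read that answer out. It proposes an early-exit method based on free continuations that shortens generation and improves accuracy.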

Contribution

It formalizes the detection-extraction gap and introduces Black-box Adaptive Early Exit (BAEE), a black-box early-exit method that cuts serial generation while improving answer accuracy.

Findings

01. 52-88% of chain-of-thought tokens are produced after the answer is recoverable.

02. BAEE reduces serial generation by 70-78% and improves accuracy by 1-5 percentage points.

03. Early exit prevents thinking-mode models from overwriting an already correct answer, gaining up to 5.8 percentage points.
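
The early-exit idea behind these findings can be illustrated with a minimal sketch. This is not the authors' BAEE implementation: the checkpoint schedule, the number of samples, the agreement threshold, and the `sample_free_continuation` callable are all hypothetical names and values chosen for illustration. The sketch replays a recorded trace offline, samples a few free continuations from each prefix, and stops at the first checkpoint where enough samples agree on an answer.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple


def majority_answer(answers: List[Optional[str]]) -> Tuple[Optional[str], float]:
    """Return the most common non-empty answer and its share of all samples."""
    valid = [a for a in answers if a is not None]
    if not valid:
        return None, 0.0
    answer, count = Counter(valid).most_common(1)[0]
    return answer, count / len(answers)


def early_exit(
    trace_tokens: List[str],
    # Hypothetical black-box hook: given a prefix, let the model continue
    # freely and return the final answer it reaches (or None if it gives none).
    sample_free_continuation: Callable[[List[str]], Optional[str]],
    checkpoints: Sequence[float] = (0.1, 0.2, 0.4, 0.6, 0.8),  # assumed schedule
    num_samples: int = 4,                                       # assumed value
    agreement_threshold: float = 0.75,                          # assumed value
) -> Tuple[Optional[str], float]:
    """Return (answer, fraction of the trace consumed at exit)."""
    for fraction in checkpoints:
        cut = max(1, int(len(trace_tokens) * fraction))
        prefix = trace_tokens[:cut]
        answers = [sample_free_continuation(prefix) for _ in range(num_samples)]
        answer, agreement = majority_answer(answers)
        if answer is not None and agreement >= agreement_threshold:
            return answer, fraction  # samples agree: stop generating here
    return None, 1.0  # no confident early exit: fall back to the full trace
```

To try it, pass any function that maps a token prefix to an answer string as `sample_free_continuation`; in practice that function would wrap a call to the model's sampling API, which this sketch deliberately leaves abstract.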

Abstract

Modern reasoning models continue generating long after the answer is already determined. Across five model configurations, two families, and three benchmarks, we find that 52-88% of chain-of-thought tokens are produced after the answer is recoverable from a partial prefix. This post-commitment generation reveals a structural phenomenon: the detection-extraction gap. Free continuations from early prefixes recover the correct answer even at 10% of the trace, while forced extraction fails on 42% of these cases. The answer is recoverable from the model state, yet prompt-conditioned decoding fails to extract it. We formalize this mismatch via a total-variation bound between free and forced continuation distributions, yielding quantitative estimates of suffix-induced shift. Exploiting this asymmetry, we propose Black-box Adaptive Early Exit (BAEE), which uses free continuations for both…
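
For reference, the total-variation distance mentioned in the abstract has a standard definition, sketched below. The symbols P_free and P_forced are labels assumed here for the answer distributions under free and forced continuation; the paper's own notation and its specific bound are not reproduced on this page. The second equality holds for discrete distributions, and the last line is the general fact that the probability of any single event (such as producing the correct answer) can differ between the two distributions by at most their total-variation distance. The snippet assumes the amsmath package for `\text`.

```latex
\[
\mathrm{TV}(P_{\text{free}}, P_{\text{forced}})
  = \sup_{A} \bigl| P_{\text{free}}(A) - P_{\text{forced}}(A) \bigr|
  = \tfrac{1}{2} \sum_{y} \bigl| P_{\text{free}}(y) - P_{\text{forced}}(y) \bigr|
\]
\[
\bigl| P_{\text{free}}(\text{correct}) - P_{\text{forced}}(\text{correct}) \bigr|
  \le \mathrm{TV}(P_{\text{free}}, P_{\text{forced}})
\]
```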


Code & Models

Repositories

EdWangLoDaSc/know2say (GitHub)
