Reasoning Can Be Restored by Correcting a Few Decision Tokens

Changshuo Shen; Leheng Sheng; Yuxin Chen; An Zhang; Xiang Wang

arXiv:2605.16874 · cs.AI · May 19, 2026

TL;DR

The paper finds that base models fail mainly at early planning decision points during reasoning, and it proposes a targeted token-level intervention to improve their reasoning performance.

Contribution

The paper introduces a disagreement-guided token intervention that selectively delegates decision tokens to a reasoning model while the base model generates the rest, improving the base model's reasoning performance.
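The delegation idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy models `toy_base` and `toy_reasoning`, the KL-based trigger, the fixed `threshold`, and greedy decoding are hypothetical stand-ins, not the authors' implementation or their actual selection rule.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two token->probability dicts (assumed divergence choice)."""
    vocab = set(p) | set(q)
    return sum(
        p.get(t, 0.0) * math.log((p.get(t, 0.0) + eps) / (q.get(t, 0.0) + eps))
        for t in vocab
        if p.get(t, 0.0) > 0
    )

def argmax_token(dist):
    """Greedy choice: the most probable token in the distribution."""
    return max(dist, key=dist.get)

def generate_with_intervention(base_model, reasoning_model, prompt, max_tokens, threshold):
    """Decode with the base model, delegating high-disagreement steps.

    Both models are callables mapping a token list to a next-token
    distribution. Returns the token list and the delegated positions.
    """
    tokens = list(prompt)
    delegated = []
    for _ in range(max_tokens):
        base_dist = base_model(tokens)
        reason_dist = reasoning_model(tokens)
        if kl_divergence(reason_dist, base_dist) > threshold:
            # Decision token: take the reasoning model's choice.
            delegated.append(len(tokens))
            next_token = argmax_token(reason_dist)
        else:
            next_token = argmax_token(base_dist)
        tokens.append(next_token)
        if next_token == "<eos>":
            break
    return tokens, delegated

# Hypothetical toy models: they differ only at the first (planning) step.
def toy_base(tokens):
    if len(tokens) == 1:
        return {"guess": 0.6, "plan": 0.4}
    if tokens[-1] in ("plan", "guess"):
        return {"solve": 0.9, "<eos>": 0.1}
    return {"<eos>": 0.9, "solve": 0.1}

def toy_reasoning(tokens):
    if len(tokens) == 1:
        return {"plan": 0.95, "guess": 0.05}
    return toy_base(tokens)

tokens, delegated = generate_with_intervention(
    toy_base, toy_reasoning, ["Q"], max_tokens=8, threshold=0.5
)
baseline, _ = generate_with_intervention(
    toy_base, toy_reasoning, ["Q"], max_tokens=8, threshold=math.inf
)
```

In this toy run only the first generated position is delegated, and the base model produces every later token. The sketch queries both models at every step for simplicity; a practical system would likely need a cheaper trigger, and this excerpt does not say how the paper handles that.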

Findings

01. On Qwen3-0.6B, only ~8% of generated tokens account for the salient disagreement

02. Early planning tokens are critical for reasoning success

03. Sparse intervention can surpass reasoning model performance

Abstract

Large reasoning models (LRMs) substantially outperform their base LLM counterparts on challenging reasoning benchmarks, yet it remains poorly understood where base models go wrong during token-by-token generation and how to narrow this gap efficiently. We study the base-reasoning gap by quantifying token-level distributional disagreement between a base model and a stronger reasoning model using likelihood-based divergences. Across benchmarks, we find that the reasoning advantage is highly sparse and concentrates on a small set of early, planning-related decision tokens. For instance, on Qwen3-0.6B, only ~8% of generated tokens account for the salient disagreement, and these tokens concentrate early in the response, are strongly enriched in planning-related decisions (17x), and coincide with high base-model uncertainty, suggesting that base models fail mainly at early planning…
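To make the token-level disagreement measurement concrete, here is a minimal sketch that scores each position by a divergence between the two models' next-token distributions and flags the salient positions. The abstract says only "likelihood-based divergences", so the use of KL divergence, the threshold value, and the toy distributions are all assumptions made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two probability lists over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def per_token_disagreement(reasoning_dists, base_dists):
    """One divergence score per generated position."""
    return [kl_divergence(r, b) for r, b in zip(reasoning_dists, base_dists)]

def salient_positions(scores, threshold):
    """Positions whose disagreement exceeds an (assumed) fixed threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]

# Toy next-token distributions over a 3-token vocabulary, one row per position.
base_dists = [
    [0.50, 0.30, 0.20],
    [0.90, 0.05, 0.05],
    [0.34, 0.33, 0.33],  # base model is uncertain here
    [0.80, 0.10, 0.10],
]
reasoning_dists = [
    [0.50, 0.30, 0.20],  # identical to the base model
    [0.88, 0.06, 0.06],  # nearly identical
    [0.02, 0.96, 0.02],  # sharp disagreement
    [0.80, 0.10, 0.10],  # identical to the base model
]

scores = per_token_disagreement(reasoning_dists, base_dists)
salient = salient_positions(scores, threshold=0.5)
fraction = len(salient) / len(scores)
```

In this toy example a single position carries essentially all of the disagreement, and it is the one where the base distribution is close to uniform. That mirrors, in miniature, the sparsity and uncertainty pattern the abstract reports.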

Code & Models

Repositories

AlphaLab-USTC/RRTokenIntervention (GitHub)