Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning

Hanbing Liu; Lang Cao; Yuanyi Ren; Mengyu Zhou; Haoyu Dong; Xiaojun Ma; Shi Han; Dongmei Zhang

arXiv:2506.08125·cs.LG·April 20, 2026

TL;DR

This paper introduces a token significance-aware reinforcement learning framework that reduces unnecessary explanation length in LLM reasoning, improving efficiency without sacrificing accuracy.

Contribution

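It proposes a significance-aware length reward, which penalizes only the tokens that contribute little to the final answer, and a dynamic length reward that favors detailed reasoning early in training and shifts toward conciseness as training progresses.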

Findings

01 Substantial reduction in response length across benchmarks.

02 Maintains or improves reasoning correctness.

03 Highlights the importance of token significance modeling.

Abstract

Large language models (LLMs) show strong reasoning abilities but often produce unnecessarily long explanations that reduce efficiency. Although reinforcement learning (RL) has been used to improve reasoning, most methods focus on accuracy and rely on uniform length-based rewards that overlook the differing contributions of individual tokens, often harming correctness. We revisit length optimization in RL through the perspective of token significance. Observing that many chain-of-thought (CoT) tokens contribute little to the final answer, we introduce a significance-aware length reward that selectively penalizes insignificant tokens, reducing redundancy while preserving essential reasoning. We also propose a dynamic length reward that encourages more detailed reasoning early in training and gradually shifts toward conciseness as learning progresses. Integrating these components into…
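To make the two reward components concrete, here is a minimal Python sketch of how they might be combined. It is not the paper's formulation: the per-token significance scores, the `threshold`, the weights `alpha` and `beta`, the normalizer `max_len`, and the linear schedule in `dynamic_length_weight` are all assumptions chosen for illustration. The sketch only mirrors the two ideas stated in the abstract: penalize tokens judged insignificant, and let the reward for significant-token length move from positive early in training to negative later.

```python
from typing import Sequence


def dynamic_length_weight(step: int, total_steps: int) -> float:
    """Hypothetical linear schedule: +1.0 at step 0, -1.0 at the final step."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - 2.0 * progress


def shaped_reward(
    correct: bool,
    significance: Sequence[float],
    step: int,
    total_steps: int,
    max_len: int,
    threshold: float = 0.5,  # assumed cutoff between significant and insignificant
    alpha: float = 0.2,      # assumed weight of the redundancy penalty
    beta: float = 0.1,       # assumed weight of the dynamic length term
) -> float:
    """Accuracy reward shaped by token significance (illustrative only)."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")

    # Split the response into insignificant and significant tokens.
    n_insig = sum(1 for s in significance if s < threshold)
    n_sig = len(significance) - n_insig

    accuracy = 1.0 if correct else 0.0
    # Only insignificant tokens are penalized for adding length.
    redundancy_penalty = alpha * n_insig / max_len
    # Significant-token length is rewarded early and penalized late.
    length_term = beta * dynamic_length_weight(step, total_steps) * n_sig / max_len
    return accuracy - redundancy_penalty + length_term


# Same response scored at the start and at the end of training.
scores = [0.9, 0.1, 0.8, 0.2]
early = shaped_reward(True, scores, step=0, total_steps=100, max_len=4)
late = shaped_reward(True, scores, step=100, total_steps=100, max_len=4)
```

Under these assumptions the same correct response earns a higher reward early in training than late, because the term on significant tokens changes sign while the penalty on insignificant tokens stays fixed. How significance is actually estimated and scheduled is defined in the paper itself, not here.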

Code & Models

Models

🤗 hanbing0/Bingo (model)
