SmartThinker: Progressive Chain-of-Thought Length Calibration for Efficient Large Language Model Reasoning

Chenzhi Hu; Qinzhe Hu; Yuhang Xu; Junyi Chen; Ruijie Wang; Shengzhong Liu; Jianxin Li; Fan Wu; Guihai Chen

arXiv:2603.08000·cs.CL·March 10, 2026

TL;DR

SmartThinker introduces a dynamic chain-of-thought length calibration method for large reasoning models, reducing verbosity and improving accuracy by adaptively optimizing response length during training.

Contribution

The paper proposes a GRPO-based method that dynamically estimates the optimal reasoning length during training and adjusts the length-reward coefficients to improve both efficiency and accuracy.
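As a rough illustration of the general idea, the sketch below combines the standard group-relative advantage used in GRPO (each reward is normalized against the mean and standard deviation of its own sampling group) with a length-aware reward. The function `length_aware_reward`, its linear overshoot penalty, and the parameters `target_length` and `coeff` are assumptions made for this example; the paper's actual reward design is not given on this page.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: normalize each reward within its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


def length_aware_reward(correct, length, target_length, coeff):
    """Hypothetical reward (an assumption, not the paper's formula).

    Gives 1.0 for a correct answer and 0.0 otherwise, then subtracts a
    penalty proportional to how far the response overshoots target_length.
    Responses at or below target_length are not penalized.
    """
    base = 1.0 if correct else 0.0
    overshoot = max(0, length - target_length) / target_length
    return base - coeff * overshoot


if __name__ == "__main__":
    # One group of sampled responses: (is_correct, length_in_tokens).
    group = [(True, 800), (True, 1500), (False, 600), (True, 2000)]
    rewards = [length_aware_reward(c, n, target_length=1000, coeff=0.2)
               for c, n in group]
    print(rewards)
    print(group_relative_advantages(rewards))
```

In this sketch a correct but overlong response still earns more than an incorrect one as long as `coeff` is small, which is one simple way to avoid trading accuracy for brevity.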

Findings

01. Achieves up to 52.5% length compression with better accuracy.

02. Improves accuracy by up to 16.6% on challenging benchmarks.

03. Effectively balances response length and reasoning quality.

Abstract

Large reasoning models (LRMs) like OpenAI o1 and DeepSeek-R1 achieve high accuracy on complex tasks by adopting long chain-of-thought (CoT) reasoning paths. However, the inherent verbosity of these processes frequently results in redundancy and overthinking. To address this issue, existing works leverage Group Relative Policy Optimization (GRPO) to reduce LRM output length, but their static length reward design cannot dynamically adapt according to the relative problem difficulty and response length distribution, causing over-compression and compromised accuracy. Therefore, we propose SmartThinker, a novel GRPO-based efficient reasoning method with progressive CoT length calibration. SmartThinker makes a two-fold contribution: First, it dynamically estimates the optimal length with peak accuracy during training and guides overlong responses toward it to reduce response length while…
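The first contribution described above, estimating the length at which accuracy peaks and steering overlong responses toward it, could be approximated as in the following sketch. It assumes a simple binning estimator and an exponential moving average for the progressive update; the names `estimate_peak_accuracy_length`, `update_target`, `bin_width`, and `momentum` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def estimate_peak_accuracy_length(samples, bin_width=256):
    """Return the midpoint of the length bin with the highest accuracy.

    samples: iterable of (length_in_tokens, is_correct) pairs collected
    from sampled responses. Ties are broken in favor of the shorter bin.
    This binning estimator is an assumption made for illustration.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for length, correct in samples:
        b = length // bin_width
        totals[b] += 1
        hits[b] += int(correct)
    best_bin = max(totals, key=lambda b: (hits[b] / totals[b], -b))
    return best_bin * bin_width + bin_width // 2


def update_target(previous, estimate, momentum=0.9):
    """Move the running target length gradually toward the new estimate."""
    return momentum * previous + (1 - momentum) * estimate


if __name__ == "__main__":
    samples = [(100, False), (300, True), (350, True),
               (900, True), (950, False)]
    estimate = estimate_peak_accuracy_length(samples)
    print(estimate)
    print(update_target(1000, estimate))
```

Smoothing the target rather than jumping to each new estimate is one way to keep the length signal stable between training steps; whether the paper uses anything similar cannot be determined from the truncated abstract.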

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Materials Science