When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning
Xiaoyun Zhang, Jingqing Ruan, Xing Ma, Yawen Zhu, Haodong Zhao, Hao Li, Jiansong Chen, Ke Zeng, Xunliang Cai

TL;DR
This paper introduces Adaptive Self-Recovery Reasoning (ASRR), a framework that dynamically adjusts reasoning effort in large reasoning models to improve efficiency and safety with minimal accuracy loss.
Contribution
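The paper systematically analyzes the Long-Thinking and No-Thinking modes of large reasoning models (LRMs), uncovers an implicit self-recovery mechanism in which models supplement reasoning during answer generation, and proposes ASRR to allocate reasoning effort adaptively according to problem difficulty.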
Findings
Compared with GRPO, ASRR reduces the reasoning budget by up to 32.5% with minimal accuracy loss.
ASRR significantly improves safety benchmark scores, by up to 21.7%.
Models with ASRR maintain high accuracy while being more efficient.
Abstract
Large reasoning models (LRMs) achieve remarkable performance via long reasoning chains, but often incur excessive computational overhead due to redundant reasoning, especially on simple tasks. In this work, we systematically quantify the upper bounds of LRMs under both Long-Thinking and No-Thinking modes, and uncover the phenomenon of "Internal Self-Recovery Mechanism" where models implicitly supplement reasoning during answer generation. Building on this insight, we propose Adaptive Self-Recovery Reasoning (ASRR), a framework that suppresses unnecessary reasoning and enables implicit recovery. By introducing accuracy-aware length reward regulation, ASRR adaptively allocates reasoning effort according to problem difficulty, achieving high efficiency with negligible performance sacrifice. Experiments across multiple benchmarks and models show that, compared with GRPO, ASRR reduces…
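The abstract names "accuracy-aware length reward regulation" but does not give its formula here. The following is a minimal illustrative sketch of one way such a rule could work, not the authors' implementation: a length penalty is applied to a group of sampled responses only once the group's accuracy passes a threshold, so hard problems keep their full reasoning budget. The function name `length_regulated_rewards`, the parameters `acc_threshold` and `alpha`, and the linear penalty shape are all assumptions made for illustration.

```python
from typing import List


def length_regulated_rewards(
    correct: List[bool],
    lengths: List[int],
    acc_threshold: float = 0.75,  # hypothetical: accuracy at which the penalty switches on
    alpha: float = 0.5,           # hypothetical: maximum penalty strength
) -> List[float]:
    """Illustrative accuracy-gated length penalty over one group of sampled responses.

    This is an assumed form, not the formula from the paper.
    """
    if not correct or len(correct) != len(lengths):
        raise ValueError("correct and lengths must be non-empty and equally long")
    if not 0.0 <= acc_threshold < 1.0:
        raise ValueError("acc_threshold must be in [0, 1)")

    group_acc = sum(correct) / len(correct)

    # No length pressure while the group is still struggling with the problem.
    if group_acc < acc_threshold:
        strength = 0.0
    else:
        # Penalty strength grows linearly from 0 at the threshold to alpha at accuracy 1.0.
        strength = alpha * (group_acc - acc_threshold) / (1.0 - acc_threshold)

    min_len, max_len = min(lengths), max(lengths)
    span = max_len - min_len

    rewards = []
    for ok, n in zip(correct, lengths):
        base = 1.0 if ok else 0.0
        # Relative overlength in [0, 1]; the shortest response in the group is never penalized.
        overlong = (n - min_len) / span if span > 0 else 0.0
        rewards.append(base - strength * overlong)
    return rewards


if __name__ == "__main__":
    # Easy problem: every sample is correct, so longer answers earn less.
    print(length_regulated_rewards([True, True, True, True], [100, 200, 300, 500]))
    # Hard problem: accuracy is below the threshold, so length is not penalized.
    print(length_regulated_rewards([True, False, False, False], [100, 500, 300, 200]))
```

Under these assumptions, gating the penalty on group accuracy is what ties reasoning length to difficulty: the pressure to shorten responses appears only where the model already answers reliably.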
