Efficient Reasoning for Large Reasoning Language Models via Certainty-Guided Reflection Suppression

Jiameng Huang; Baijiong Lin; Guhao Feng; Jierun Chen; Di He; and Lu Hou

arXiv:2508.05337 · cs.CL · November 18, 2025

TL;DR

This paper introduces Certainty-Guided Reflection Suppression (CGRS), a method that reduces redundant reasoning steps in large reasoning language models, cutting token usage by 18.5% to 41.9% while maintaining reasoning accuracy.

Contribution

CGRS is a model-agnostic approach that requires no retraining. It dynamically suppresses the generation of reflection triggers when the model is highly confident in its current response, improving the efficiency of large reasoning language models.

Findings

01. Reduces token usage by 18.5% to 41.9%.

02. Maintains reasoning accuracy across benchmarks.

03. Effective across various model architectures and scales.

Abstract

Recent Large Reasoning Language Models (LRLMs) employ long chain-of-thought reasoning with complex reflection behaviors, typically signaled by specific trigger words (e.g., "Wait" and "Alternatively"), to enhance performance. However, these reflection behaviors can lead to the overthinking problem, in which the model generates redundant reasoning steps that unnecessarily increase token usage, raise inference costs, and reduce practical utility. In this paper, we propose Certainty-Guided Reflection Suppression (CGRS), a novel method that mitigates overthinking in LRLMs while maintaining reasoning accuracy. CGRS operates by dynamically suppressing the model's generation of reflection triggers when it exhibits high confidence in its current response, thereby preventing redundant reflection cycles without compromising output quality. Our approach is model-agnostic, requires no retraining or…
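
The suppression idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how certainty is measured or how suppression is applied, so the `certainty` input, the `threshold` value, the hard masking rule, and the function names below are all assumptions made for illustration. Only the trigger words "Wait" and "Alternatively" come from the text.

```python
import math

# Trigger words taken from the examples in the abstract; a real system
# would work on token ids rather than strings (assumption).
REFLECTION_TRIGGERS = {"Wait", "Alternatively"}


def suppress_reflection(logits, certainty, threshold=0.9):
    """Return a copy of `logits` (token -> score) in which reflection
    triggers are masked out whenever `certainty` reaches `threshold`.

    Both `certainty` and `threshold` are hypothetical: the abstract only
    says suppression happens when the model is highly confident.
    """
    if certainty < threshold:
        # Low confidence: leave the distribution untouched so the model
        # is still free to reflect.
        return dict(logits)
    # High confidence: make trigger tokens impossible to select.
    return {
        tok: (-math.inf if tok in REFLECTION_TRIGGERS else score)
        for tok, score in logits.items()
    }


def greedy_pick(logits):
    """Pick the highest-scoring token (greedy decoding, for illustration)."""
    return max(logits, key=logits.get)


# Toy next-token scores in which a reflection trigger is the top choice.
toy_logits = {"Wait": 2.0, "Alternatively": 1.5, "Therefore": 1.8, "So": 0.4}

low = greedy_pick(suppress_reflection(toy_logits, certainty=0.3))
high = greedy_pick(suppress_reflection(toy_logits, certainty=0.95))
```

In this toy setup the low-certainty call still selects "Wait", while the high-certainty call falls through to "Therefore", the best non-trigger token. Because the masking acts only on next-token scores at decoding time, a sketch like this needs no change to model weights, which is consistent with the no-retraining claim above.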
