TL;DR
This paper introduces Certainty-Guided Reflection Suppression (CGRS), a method to reduce redundant reasoning steps in large reasoning language models, significantly decreasing token usage while maintaining reasoning accuracy.
Contribution
CGRS is a model-agnostic approach that requires no retraining. It dynamically suppresses reflection triggers when the model is confident in its current response, improving inference efficiency in large reasoning language models.
Findings
Reduces token usage by 18.5% to 41.9%.
Maintains reasoning accuracy across benchmarks.
Effective across various model architectures and scales.
Abstract
Recent Large Reasoning Language Models (LRLMs) employ long chain-of-thought reasoning with complex reflection behaviors, typically signaled by specific trigger words (e.g., "Wait" and "Alternatively"), to enhance performance. However, these reflection behaviors can lead to the overthinking problem, in which the model generates redundant reasoning steps that unnecessarily increase token usage, raise inference costs, and reduce practical utility. In this paper, we propose Certainty-Guided Reflection Suppression (CGRS), a novel method that mitigates overthinking in LRLMs while maintaining reasoning accuracy. CGRS operates by dynamically suppressing the model's generation of reflection triggers when it exhibits high confidence in its current response, thereby preventing redundant reflection cycles without compromising output quality. Our approach is model-agnostic, requires no retraining or…
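The suppression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the excerpt does not say how certainty is estimated or how suppression is applied, so the sketch assumes a certainty score in [0, 1] is supplied by the caller, uses a hypothetical threshold of 0.9, and applies a hard mask to the trigger-token logits. The function names and the toy vocabulary are also illustrative.

```python
import math
from typing import Iterable, List


def suppress_reflection_triggers(
    logits: List[float],
    trigger_ids: Iterable[int],
    certainty: float,
    threshold: float = 0.9,  # hypothetical value, not taken from the paper
) -> List[float]:
    """Return a copy of `logits` with trigger tokens masked when certainty is high.

    Assumption: `certainty` is a score in [0, 1] computed elsewhere.
    Assumption: suppression is a hard mask (logit set to -inf).
    """
    out = list(logits)  # copy so the caller's logits are left unchanged
    if certainty >= threshold:
        for tid in trigger_ids:
            out[tid] = -math.inf  # token can no longer be selected
    return out


def greedy_pick(logits: List[float]) -> int:
    """Index of the highest logit (greedy decoding for one step)."""
    return max(range(len(logits)), key=logits.__getitem__)


# Toy vocabulary: indices 0 and 1 are the reflection triggers.
vocab = ["Wait", "Alternatively", "So", "The"]
trigger_ids = [0, 1]
step_logits = [2.0, 1.0, 1.5, 0.5]

confident = suppress_reflection_triggers(step_logits, trigger_ids, certainty=0.95)
unsure = suppress_reflection_triggers(step_logits, trigger_ids, certainty=0.40)

print(vocab[greedy_pick(confident)])  # prints "So": triggers are masked
print(vocab[greedy_pick(unsure)])     # prints "Wait": reflection is still allowed
```

Because the mask is applied only to the logits at decoding time, a scheme of this kind needs no change to model weights, which is consistent with the no-retraining claim above.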