ConCISE: Confidence-guided Compression in Step-by-step Efficient Reasoning

Ziqing Qiao; Yongheng Deng; Jiali Zeng; Dong Wang; Lai Wei; Guanbo Wang; Fandong Meng; Jie Zhou; Ju Ren; Yaoxue Zhang

arXiv:2505.04881·cs.LG·September 22, 2025

TL;DR

ConCISE is a confidence-guided framework that shortens the verbose reasoning chains produced by large reasoning models, making reasoning more efficient with minimal accuracy loss.

Contribution

This work introduces ConCISE, a novel confidence-guided compression framework that effectively shortens reasoning chains while preserving performance in large reasoning models.

Findings

01  Reduces reasoning chain length by up to 50%

02  Maintains high task accuracy after compression

03  Improves efficiency of large reasoning models

Abstract

Large Reasoning Models (LRMs) perform strongly on complex reasoning tasks via Chain-of-Thought (CoT) prompting, but they often produce verbose outputs that increase computational overhead. Existing fine-tuning-based compression methods either apply post-hoc pruning, which risks disrupting reasoning coherence, or rely on sampling-based selection, which fails to remove redundant content thoroughly. To address these limitations, this work first frames two key patterns of redundant reflection in LRMs from a confidence-guided perspective: Confidence Deficit, in which the model reflects on correct intermediate steps, and Termination Delay, in which reflection continues after a verified, confident answer. Based on this, we introduce ConCISE (Confidence-guided Compression In Step-by-step Efficient Reasoning), a framework designed to generate concise reasoning chains, integrating Confidence…
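The two redundancy patterns named in the abstract can be made concrete with a toy example. The sketch below is not the paper's method: it only encodes the two definitions (Confidence Deficit and Termination Delay) over a hand-labelled reasoning trace. The `Step` class, its fields, and both function names are hypothetical, and the labels are assumed to be given rather than estimated from model confidence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    text: str
    is_reflection: bool          # the step re-checks earlier work
    target_was_correct: bool     # hypothetical label: the re-checked step was already correct
    after_verified_answer: bool  # hypothetical label: a verified, confident answer already exists


def redundancy_type(step: Step) -> Optional[str]:
    """Return the redundancy pattern for a step, or None if the step is kept."""
    if not step.is_reflection:
        return None
    if step.after_verified_answer:
        return "termination_delay"   # reflection continues after the answer is settled
    if step.target_was_correct:
        return "confidence_deficit"  # reflection on an already-correct intermediate step
    return None  # a reflection that revisits a wrong step is useful, so keep it


def drop_redundant(trace: List[Step]) -> List[Step]:
    """Keep only the steps that match neither redundancy pattern."""
    return [s for s in trace if redundancy_type(s) is None]


# Toy trace with hand-assigned labels (illustrative only).
example_trace = [
    Step("2 + 3 = 5", False, False, False),
    Step("Wait, let me re-check 2 + 3.", True, True, False),
    Step("5 * 4 = 20, so the answer is 20.", False, False, False),
    Step("Let me verify once more.", True, True, True),
]
```

Under these assumed labels, `drop_redundant(example_trace)` keeps the two computation steps and removes both reflections. A real system would have to estimate confidence from the model itself, and the abstract as excerpted here does not specify how.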

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)

Methods: Early Stopping