ConMax: Confidence-Maximizing Compression for Efficient Chain-of-Thought Reasoning

Minda Hu; Zexuan Qiu; Zenan Xu; Kun Li; Bo Zhou; Irwin King

arXiv:2601.04973·cs.AI·January 9, 2026

Open Access

TL;DR

ConMax is a reinforcement learning framework that compresses the reasoning traces of large reasoning models, reducing computational cost while maintaining accuracy and reasoning quality.

Contribution

ConMax introduces a novel reward-based compression method that preserves reasoning integrity and improves the efficiency of large reasoning models.

Findings

01. Reduces inference length by 43% compared to baselines.

02. Maintains 99.3% of original accuracy after compression.

03. Demonstrates effectiveness across five reasoning datasets.

Abstract

Recent breakthroughs in Large Reasoning Models (LRMs) have demonstrated that extensive Chain-of-Thought (CoT) generation is critical for enabling intricate cognitive behaviors, such as self-verification and backtracking, to solve complex tasks. However, this capability often leads to "overthinking", where models generate redundant reasoning paths that inflate computational costs without improving accuracy. While Supervised Fine-Tuning (SFT) on reasoning traces is a standard paradigm for the "cold start" phase, applying existing compression techniques to these traces often compromises logical coherence or incurs prohibitive sampling costs. In this paper, we introduce ConMax (Confidence-Maximizing Compression), a novel reinforcement learning framework designed to automatically compress reasoning traces while preserving essential reasoning patterns. ConMax formulates compression as a…

Taxonomy

Topics: Machine Learning in Healthcare · AI-based Problem Solving and Planning · Multimodal Machine Learning Applications