EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation

Jinghan Jia; Hadi Reisizadeh; Chongyu Fan; Nathalie Baracaldo; Mingyi Hong; Sijia Liu

arXiv:2506.04205 · cs.LG · June 5, 2025

TL;DR

EPiC introduces a method to condense chain-of-thought traces by preserving only the initial and final reasoning segments, significantly reducing training costs while maintaining reasoning accuracy.

Contribution

This work presents the first approach to thought-level CoT condensation, enabling resource-efficient training without sacrificing reasoning performance.

Findings

01. EPiC reduces training time by over 34%.
02. Maintains comparable reasoning accuracy to full CoT supervision.
03. Effective across multiple model families and benchmarks.

Abstract

Large language models (LLMs) have shown remarkable reasoning capabilities when trained with chain-of-thought (CoT) supervision. However, the long and verbose CoT traces, especially those distilled from large reasoning models (LRMs) such as DeepSeek-R1, significantly increase training costs during the distillation process, where a non-reasoning base model is taught to replicate the reasoning behavior of an LRM. In this work, we study the problem of CoT condensation for resource-efficient reasoning training, aimed at pruning intermediate reasoning steps (i.e., thoughts) in CoT traces, enabling supervised model training on length-reduced CoT data while preserving both answer accuracy and the model's ability to generate coherent reasoning. Our rationale is that CoT traces typically follow a three-stage structure: problem understanding, exploration, and solution convergence. Through…
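The edge-preserving idea summarized above (keep the opening and closing thoughts of a trace, drop the middle) can be illustrated with a minimal sketch. It assumes that thoughts are separated by blank lines and that the kept budget is divided between the head and the tail by a hypothetical `head_share` parameter. Neither the delimiter nor the function and parameter names come from the paper, and the authors' implementation in optml-group/epic may differ.

```python
import math


def split_thoughts(trace: str, delimiter: str = "\n\n") -> list[str]:
    """Split a CoT trace into thoughts.

    Assumption: thoughts are separated by blank lines. The paper's own
    segmentation rule is not given in this summary.
    """
    return [t.strip() for t in trace.split(delimiter) if t.strip()]


def condense_edges(
    thoughts: list[str],
    keep_ratio: float = 0.5,
    head_share: float = 0.5,
) -> list[str]:
    """Keep only the first and last thoughts of a trace.

    keep_ratio: fraction of thoughts to retain overall (hypothetical knob).
    head_share: fraction of the retained thoughts taken from the start;
                the remainder is taken from the end (hypothetical knob).
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not 0.0 <= head_share <= 1.0:
        raise ValueError("head_share must be in [0, 1]")

    n = len(thoughts)
    # Retain at least one thought for any non-empty trace.
    n_keep = min(n, max(1, math.ceil(keep_ratio * n)))
    n_head = math.ceil(n_keep * head_share)
    n_tail = n_keep - n_head

    # Slicing with n - n_tail avoids the thoughts[-0:] pitfall when n_tail == 0.
    return thoughts[:n_head] + thoughts[n - n_tail:]


if __name__ == "__main__":
    trace = "\n\n".join(
        [
            "Understand the problem.",
            "Try approach A.",
            "Approach A fails; try B.",
            "Refine B.",
            "Check B against the constraints.",
            "Conclude with the final answer.",
        ]
    )
    condensed = condense_edges(split_thoughts(trace), keep_ratio=0.5)
    print("\n\n".join(condensed))
```

The condensed thoughts would then be joined back into a shorter trace and used as the supervision target in place of the full CoT. How the head and tail budgets are actually chosen is a design decision this sketch does not settle.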

Peer Reviews

Decision · Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 3

Strengths

- The topic and motivation, resource-efficient reasoning training, are important and timely.
- The paper is well-written and easy to follow, with a clear presentation of the problem and method.
- The authors conduct comprehensive empirical analyses, offering useful insights that intermediate reasoning steps may be less important than early or final stages.

Weaknesses

**[W1] Inconsistent and limited performance improvements.** In Table 2, the proposed method outperforms the baseline on MATH, AIME-25, and GPQA, but shows a significant drop on AIME-24. Moreover, efficiency metrics such as the number of tokens do not show clear advantages, following trends similar to competing methods.

**[W2] Insufficient baselines and related work.** Recent studies have analyzed the importance of reasoning steps or proposed strategies (e.g., [1, 2]) that appear directly applic

Reviewer 02 · Rating 4 · Confidence 4

Strengths

S1: The research problem of this paper, the efficient training of reasoning models, is both timely and highly relevant.

S2: The paper is well-written and easy to follow. Additionally, the proposed method is simple yet effective.

S3: The authors conduct extensive experiments to verify the effectiveness of EPiC, and the reported results show that it achieves a strong utility-efficiency trade-off.

Weaknesses

W1: The experimental setup presented in Figure 2 seems questionable, which weakens the paper's motivation. The S1 and LIMO datasets are significantly smaller in the number of examples than OpenR1Math. Therefore, it is expected that models trained on S1 and LIMO would underperform those trained on OpenR1Math. A fairer comparison would involve ensuring that the total number of tokens in the condensed dataset (e.g., after 50% randomized thought-level condensation) is comparable to the token counts

Reviewer 03 · Rating 4 · Confidence 3

Strengths

* This paper introduced a framework for thought-level condensation that enables efficient knowledge distillation. It effectively captures reasoning thoughts using a smaller non-reasoning model.
* The authors conducted analysis through visual illustrations and quantitative analysis to demonstrate that reasoning information is preserved.
* Experiments were conducted to demonstrate the effectiveness of the proposed framework.

Weaknesses

* Figure 1 uses bar charts for both performance and efficiency, which is not appropriate since they belong to different categories and should be represented on separate axes.
* Section 4 is crucial for the proposed method, but not all details are clearly presented. While all the key points are logical, they need to be clearly articulated and analyzed to make the techniques convincing.
* In Figure 3, the example of CoT trace is interesting, where the authors pointed out the concerns of token b

Code & Models

Repositories

optml-group/epic · PyTorch · Official

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks

Methods: Balanced Selection · Pruning