EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation
Jinghan Jia, Hadi Reisizadeh, Chongyu Fan, Nathalie Baracaldo, Mingyi Hong, Sijia Liu

TL;DR
EPiC introduces a method to condense chain-of-thought traces by preserving only the initial and final reasoning segments, significantly reducing training costs while maintaining reasoning accuracy.
Contribution
This work presents the first approach to thought-level CoT condensation, enabling resource-efficient training without sacrificing reasoning performance.
Findings
EPiC reduces training time by over 34%.
Maintains comparable reasoning accuracy to full CoT supervision.
Effective across multiple model families and benchmarks.
Abstract
Large language models (LLMs) have shown remarkable reasoning capabilities when trained with chain-of-thought (CoT) supervision. However, the long and verbose CoT traces, especially those distilled from large reasoning models (LRMs) such as DeepSeek-R1, significantly increase training costs during the distillation process, where a non-reasoning base model is taught to replicate the reasoning behavior of an LRM. In this work, we study the problem of CoT condensation for resource-efficient reasoning training, aimed at pruning intermediate reasoning steps (i.e., thoughts) in CoT traces, enabling supervised model training on length-reduced CoT data while preserving both answer accuracy and the model's ability to generate coherent reasoning. Our rationale is that CoT traces typically follow a three-stage structure: problem understanding, exploration, and solution convergence. Through…
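The edge-preserving idea described above (keep the early problem-understanding thoughts and the late solution-convergence thoughts, prune the middle exploration) can be illustrated with a minimal sketch. This is not the authors' implementation: the blank-line thought delimiter, the function names `split_thoughts` and `condense_edges`, and the `keep_ratio` and `head_share` parameters are all assumptions made for illustration.

```python
import math
from typing import List


def split_thoughts(cot: str, delimiter: str = "\n\n") -> List[str]:
    """Split a CoT trace into thoughts.

    Assumption: thoughts are separated by blank lines; the source does not
    specify how thoughts are delimited.
    """
    return [t.strip() for t in cot.split(delimiter) if t.strip()]


def condense_edges(
    thoughts: List[str], keep_ratio: float = 0.5, head_share: float = 0.5
) -> List[str]:
    """Keep the first and last thoughts of a trace and drop the middle.

    keep_ratio: fraction of thoughts to retain overall (hypothetical knob).
    head_share: fraction of the retained thoughts taken from the start
                (hypothetical knob); the rest come from the end.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not 0.0 <= head_share <= 1.0:
        raise ValueError("head_share must be in [0, 1]")

    n = len(thoughts)
    if n == 0:
        return []

    n_keep = max(1, min(n, math.ceil(keep_ratio * n)))
    n_head = math.ceil(n_keep * head_share)
    n_tail = n_keep - n_head

    head = thoughts[:n_head]
    # Guard against n_tail == 0, where thoughts[n:] would already be empty
    # but the intent is clearer when written explicitly.
    tail = thoughts[n - n_tail:] if n_tail > 0 else []
    return head + tail


if __name__ == "__main__":
    trace = "\n\n".join(f"thought {i}" for i in range(1, 9))
    condensed = condense_edges(split_thoughts(trace), keep_ratio=0.5)
    print(condensed)
```

Under these assumptions, an eight-thought trace condensed with `keep_ratio=0.5` keeps thoughts 1, 2, 7, and 8, and the condensed trace would then serve as the supervision target in place of the full one.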
Peer Reviews
Decision · Submitted to ICLR 2026
- The topic and motivation, resource-efficient reasoning training, are important and timely.
- The paper is well-written and easy to follow, with a clear presentation of the problem and method.
- The authors conduct comprehensive empirical analyses, offering the useful insight that intermediate reasoning steps may be less important than the early or final stages.
**[W1] Inconsistent and limited performance improvements.** In Table 2, the proposed method outperforms the baseline on MATH, AIME-25, and GPQA, but shows a significant drop on AIME-24. Moreover, efficiency metrics such as the number of tokens do not show clear advantages, following trends similar to competing methods.
**[W2] Insufficient baselines and related work.** Recent studies have analyzed the importance of reasoning steps or proposed strategies (e.g., [1, 2]) that appear directly applic
S1: The research problem of this paper, the efficient training of reasoning models, is both timely and highly relevant.
S2: The paper is well-written and easy to follow. Additionally, the proposed method is simple yet effective.
S3: The authors conduct extensive experiments to verify the effectiveness of EPiC, and the reported results show that it achieves a strong utility-efficiency trade-off.
W1: The experimental setup presented in Figure 2 seems questionable, which weakens the paper's motivation. The S1 and LIMO datasets are significantly smaller in the number of examples than OpenR1Math. Therefore, it is expected that models trained on S1 and LIMO would underperform those trained on OpenR1Math. A fairer comparison would involve ensuring that the total number of tokens in the condensed dataset (e.g., after 50% randomized thought-level condensation) is comparable to the token counts
* This paper introduced a framework for thought-level condensation that enables efficient knowledge distillation. It effectively captures reasoning thoughts using a smaller non-reasoning model.
* The authors conducted analysis through visual illustrations and quantitative analysis to demonstrate that reasoning information is preserved.
* Experiments were conducted to demonstrate the effectiveness of the proposed framework.
* Figure 1 uses bar charts for both performance and efficiency, which is not appropriate since they belong to different categories and should be represented on separate axes.
* Section 4 is crucial for the proposed method, but not all details are clearly presented. While all the key points are logical, they need to be clearly articulated and analyzed to make the techniques convincing.
* In Figure 3, the example of CoT trace is interesting, where the authors pointed out the concerns of token b
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
Methods: Balanced Selection · Pruning
