Efficient Long CoT Reasoning in Small Language Models

Zhaoyang Wang; Jinqi Jiang; Tian Qiu; Hui Liu; Xianfeng Tang; Huaxiu Yao

arXiv:2505.18440 · cs.CL · June 19, 2025

TL;DR

This paper introduces a method that prunes redundant steps from long chain-of-thought reasoning, so that small language models can learn efficient reasoning while maintaining performance.

Contribution

It proposes a simple pruning technique and on-policy data curation to help small models learn effective long CoT reasoning without unnecessary steps.

Findings

01 Reduces redundant reasoning steps in small models' CoT

02 Maintains competitive performance with fewer reasoning steps

03 Effective across multiple mathematical reasoning benchmarks

Abstract

Recent large reasoning models such as DeepSeek-R1 exhibit strong complex problem-solving abilities by generating long chain-of-thought (CoT) reasoning steps. Directly training small language models (SLMs) to produce long CoT is challenging, so distillation has become a practical way to give SLMs this reasoning ability. However, long CoT often contains a lot of redundant content (e.g., overthinking steps), which can make it hard for SLMs to learn from, given their relatively limited capacity and generalization. To address this issue, we propose a simple yet effective method to prune unnecessary steps in long CoT, and then employ an on-policy method in which the SLM itself curates valid and useful long CoT training data. In this way, SLMs can learn efficient long CoT reasoning while preserving competitive performance. Experimental results across a series of…
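
The abstract's idea of pruning unnecessary steps and letting the SLM itself validate the result can be illustrated with a minimal sketch. This is not the authors' algorithm: the function `prune_cot`, the `is_sufficient` callback, and the toy string check are all hypothetical names introduced here. The sketch assumes a long CoT is already split into a list of steps, and that some judge (in practice, presumably the SLM being trained) can say whether a prefix of those steps is enough to reach the correct answer.

```python
from typing import Callable, List, Optional


def prune_cot(
    steps: List[str],
    is_sufficient: Callable[[List[str]], bool],
) -> Optional[List[str]]:
    """Return the shortest prefix of `steps` accepted by `is_sufficient`.

    `is_sufficient` is a hypothetical stand-in for an on-policy check,
    e.g. asking the small model whether it can produce the correct
    answer from this prefix alone. Returns None if no prefix passes,
    in which case the example would be dropped from the training set.
    """
    for end in range(1, len(steps) + 1):
        prefix = steps[:end]
        if is_sufficient(prefix):
            return prefix
    return None


# Toy example: the last two steps only re-verify an answer already reached.
steps = [
    "Let x = 3.",
    "Then 2x + 1 = 7.",
    "Wait, let me double-check: 2*3 + 1 = 7.",
    "Yes, the answer is 7.",
]


def toy_check(prefix: List[str]) -> bool:
    # Assumption for illustration: a prefix is "sufficient" once it
    # contains the final result. A real check would query the model.
    return any("= 7" in step for step in prefix)


pruned = prune_cot(steps, toy_check)
```

Here `pruned` keeps only the first two steps. A linear scan is used for clarity; a real pipeline would likely need a cheaper search, since each check would cost a model call.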

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Topic Modeling