S3-CoT: Self-Sampled Succinct Reasoning Enables Efficient Chain-of-Thought LLMs

Yanrui Du; Sendong Zhao; Yibo Gao; Danyang Zhao; Qika Lin; Ming Ma; Jiayun Li; Yi Jiang; Kai He; Qianyi Xu; Bing Qin; Mengling Feng

arXiv:2602.01982 · cs.CL · February 3, 2026

Open Access · 4 Models · 1 Dataset

TL;DR

This paper introduces S3-CoT, a self-sampling framework for efficient chain-of-thought (CoT) reasoning in large language models. The target model generates its own succinct reasoning traces, which reduces reliance on scarce high-quality supervision data and supports a fast-thinking mode analogous to human System 1 reasoning.

Contribution

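The paper proposes a self-sampling method that uses activation steering to induce style-aligned, variable-length reasoning traces from the target LLM itself, without teacher guidance. The model is then fine-tuned on traces filtered by gold answers, following a progressive compression curriculum.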
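
As background on activation steering in general, the sketch below shows one common formulation: a steering direction is taken as the difference between mean hidden activations on two sets of examples and then added, scaled by a coefficient, to a hidden state. This is an illustrative assumption rather than the paper's procedure; the function names, the "concise" versus "verbose" split, and the coefficient `alpha` are all hypothetical, and the summary above does not say how S3-CoT derives or applies its steering signal.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_direction(concise_acts: List[Vector], verbose_acts: List[Vector]) -> Vector:
    """Difference of means: points from 'verbose' toward 'concise' activations.

    Hypothetical helper; the paper may compute its direction differently.
    """
    mean_concise = mean_vector(concise_acts)
    mean_verbose = mean_vector(verbose_acts)
    return [c - v for c, v in zip(mean_concise, mean_verbose)]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Shift a hidden state along the direction; alpha controls the strength."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy 2-dimensional activations standing in for real hidden states.
concise = [[1.0, 0.0], [3.0, 2.0]]
verbose = [[0.0, 1.0], [0.0, 1.0]]

direction = steering_direction(concise, verbose)
steered = steer([0.5, 0.5], direction, alpha=0.5)
```

Under this formulation, varying `alpha` gives a single knob for sampling traces of different lengths from the same model, which is one plausible way to obtain variable-length data without a teacher.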

Findings

01. Improved performance on math benchmarks and cross-domain tests.

02. Stable improvements for both general and R1-style LLMs.

03. Effective reasoning trace induction without teacher guidance.

Abstract

Large language models (LLMs) equipped with chain-of-thought (CoT) achieve strong performance and offer a window into LLM behavior. However, recent evidence suggests that improvements in CoT capabilities often come with redundant reasoning processes, motivating a key question: Can LLMs acquire a fast-thinking mode analogous to human System 1 reasoning? To explore this, our study presents a self-sampling framework based on activation steering for efficient CoT learning. Our method can induce style-aligned and variable-length reasoning traces from target LLMs themselves without any teacher guidance, thereby alleviating a central bottleneck of SFT-based methods: the scarcity of high-quality supervision data. Using data filtered by gold answers, we perform SFT for efficient CoT learning with (i) a human-like dual-cognitive system, and (ii) a progressive compression curriculum. Furthermore, we…
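
The data-preparation steps named in the abstract (filtering self-sampled traces by gold answers, then training with a progressive compression curriculum) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Trace` record, exact-match answer checking, word count as the length measure, and the longest-first staging are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trace:
    question_id: str
    reasoning: str  # self-sampled chain-of-thought text
    answer: str     # final answer extracted from the trace


def filter_by_gold(traces: List[Trace], gold: Dict[str, str]) -> List[Trace]:
    """Keep traces whose final answer exactly matches the gold answer (assumed criterion)."""
    return [
        t for t in traces
        if t.question_id in gold and t.answer.strip() == gold[t.question_id].strip()
    ]


def curriculum_stages(traces: List[Trace], num_stages: int) -> List[List[Trace]]:
    """Order traces longest-first and split them into stages.

    Assumption: 'progressive compression' means training moves from longer
    to shorter reasoning traces; length is measured in whitespace tokens.
    """
    if num_stages <= 0:
        raise ValueError("num_stages must be positive")
    if not traces:
        return []
    ordered = sorted(traces, key=lambda t: len(t.reasoning.split()), reverse=True)
    size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


# Toy data: placeholder reasoning strings of different lengths.
traces = [
    Trace("q1", "a b c d e f", "4"),
    Trace("q1", "a b", "4"),
    Trace("q1", "a b c", "5"),   # wrong answer, dropped by the filter
    Trace("q2", "a b c d", "7"),
]
gold = {"q1": "4", "q2": "7"}

kept = filter_by_gold(traces, gold)
stages = curriculum_stages(kept, num_stages=3)
stage_lengths = [len(stage[0].reasoning.split()) for stage in stages]
```

Because the filter needs only gold answers and not reference rationales, the supervision requirement stays light, which matches the abstract's point about scarce high-quality supervision data.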

Code & Models

Models

Datasets

yrdu/S3-CoT-Self-Sampled-Data
dataset · 15 downloads

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Advanced Graph Neural Networks