Self-Enhanced Reasoning Training: Activating Latent Reasoning in Small Models for Enhanced Reasoning Distillation
Yong Zhang, Bingyuan Zhang, Zhitao Li, Ming Li, Ning Cheng, Minchuan Chen, Tao Wei, Jun Ma, Shaojun Wang, Jing Xiao

TL;DR
This paper introduces Self-Enhanced Reasoning Training (SERT), a method that activates and leverages the latent reasoning capabilities of small models through self-training on their own high-quality reasoning paths, improving their reasoning performance.
Contribution
The paper proposes SERT, a novel training approach that enhances small models' reasoning abilities by utilizing their latent reasoning paths without relying on chain-of-thought prompting.
Findings
SERT improves small models' reasoning performance in distillation tasks.
Small models can generate high-quality reasoning paths during sampling.
Self-training on latent reasoning paths enhances small models' reasoning capabilities.
Abstract
The rapid advancement of large language models (LLMs) has significantly enhanced their reasoning abilities, enabling increasingly complex tasks. However, these capabilities often diminish in smaller, more computationally efficient models like GPT-2. Recent research shows that reasoning distillation can help small models acquire reasoning capabilities, but most existing methods focus primarily on improving teacher-generated reasoning paths. Our observations reveal that small models can generate high-quality reasoning paths during sampling, even without chain-of-thought prompting, though these paths are often latent due to their low probability under standard decoding strategies. To address this, we propose Self-Enhanced Reasoning Training (SERT), which activates and leverages latent reasoning capabilities in small models through self-training on filtered, self-generated reasoning paths…
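The data-collection step described in the abstract (sample several outputs per question without chain-of-thought prompting, keep only those that look like good reasoning paths, then self-train on what survives) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the filter rule in `looks_like_reasoning`, the `min_steps` threshold, the `toy_sampler` stand-in for a small model, and all function names are assumptions made for this example. The abstract does not specify the actual filtering criteria.

```python
import random
from typing import Callable, List, Tuple


def looks_like_reasoning(output: str, gold_answer: str, min_steps: int = 2) -> bool:
    """Hypothetical filter (an assumption, not the paper's rule):
    keep outputs with at least `min_steps` intermediate sentences
    whose final sentence ends with the gold answer."""
    sentences = [s.strip() for s in output.split(".") if s.strip()]
    if len(sentences) < min_steps + 1:  # intermediate steps plus the answer sentence
        return False
    return sentences[-1].endswith(gold_answer)


def collect_self_training_data(
    sample_fn: Callable[[str], str],
    questions: List[Tuple[str, str]],
    num_samples: int = 8,
) -> List[Tuple[str, str]]:
    """Sample `num_samples` outputs per question and keep the filtered ones
    as (question, reasoning_path) pairs for later fine-tuning."""
    data = []
    for question, gold in questions:
        candidates = [sample_fn(question) for _ in range(num_samples)]
        data.extend((question, c) for c in candidates if looks_like_reasoning(c, gold))
    return data


def toy_sampler(rng: random.Random) -> Callable[[str], str]:
    """Stand-in for sampling from a small model: direct answers are likely,
    full reasoning paths are rare (i.e. 'latent')."""
    pool = [
        "The answer is 5.",
        "Tom has 2 apples. He buys 3 more. So 2 + 3 = 5. The answer is 5.",
        "Tom has 2 apples. He buys 3 more. The answer is 6.",
    ]
    weights = [0.8, 0.1, 0.1]
    return lambda question: rng.choices(pool, weights=weights, k=1)[0]


if __name__ == "__main__":
    sampler = toy_sampler(random.Random(0))
    questions = [("Tom has 2 apples and buys 3 more. How many does he have?", "5")]
    kept = collect_self_training_data(sampler, questions, num_samples=50)
    print(f"kept {len(kept)} of 50 sampled outputs for self-training")
```

In a real setup, `sample_fn` would wrap temperature sampling from the small model, and the kept pairs would feed a standard supervised fine-tuning step before distillation from the teacher.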
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Topic Modeling · Advanced Text Analysis Techniques
