Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst

Hongru Wang; Deng Cai; Wanjun Zhong; Shijue Huang; Jeff Z. Pan; Zeming Liu; Kam-Fai Wong

arXiv:2505.14116 · cs.CL · May 21, 2025

TL;DR

This paper introduces Self-Reasoning Language Models (SRLM), which synthesize longer reasoning chains and improve iteratively through self-training, using a small set of demonstration examples (about 1,000) as a reasoning catalyst. The approach improves reasoning performance and makes gains across iterations more stable on multiple tasks.

Contribution

The paper proposes SRLM, a novel approach where models generate and refine reasoning chains iteratively using minimal demonstrations, leading to improved reasoning capabilities.

Findings

01. SRLM achieves +2.5 points average improvement across five tasks.

02. More sampling during inference yields up to +7.89 points improvement.

03. SRLM demonstrates more diverse and creative reasoning paths.
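
The second finding concerns drawing more samples at inference time. This summary does not say how the samples are combined into a final answer, so the sketch below assumes one common choice, majority voting over the sampled answers. The function `majority_vote` and its `sample_answer` parameter are hypothetical names used only for illustration, not the paper's procedure.

```python
from collections import Counter
from typing import Callable


def majority_vote(sample_answer: Callable[[str], str], question: str, n: int) -> str:
    """Draw n sampled answers and return the most frequent one.

    Assumption: samples are aggregated by majority vote. The source
    summary does not state the aggregation rule it used.
    """
    answers = [sample_answer(question) for _ in range(n)]
    # Counter.most_common(1) returns a list holding the top (answer, count) pair.
    return Counter(answers).most_common(1)[0][0]
```

In practice `sample_answer` would wrap a stochastic decoding call to the model and extract the final answer from the generated reasoning chain; larger `n` trades compute for a more reliable vote.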

Abstract

Inference-time scaling, which significantly enhances the performance of Large Language Models (LLMs) on complex reasoning tasks by increasing the length of Chain-of-Thought (CoT), has attracted much attention. These longer intermediate reasoning rationales embody various meta-reasoning skills in human cognition, such as reflection and decomposition, which are difficult to create and acquire. In this work, we introduce the Self-Reasoning Language Model (SRLM), in which the model itself synthesizes longer CoT data and iteratively improves its performance through self-training. By incorporating a few demonstration examples (i.e., 1,000 samples) of how to unfold hidden reasoning chains from existing responses, which act as a reasoning catalyst, we demonstrate that SRLM not only enhances the model's initial performance but also ensures more stable and consistent improvements in subsequent iterations. Our…
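
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Example`, `self_train`, `unfold`, `select`, and `finetune` are hypothetical, and both the selection rule and the choice to mix the catalyst demonstrations into every training round are assumptions the truncated abstract does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Example:
    prompt: str
    response: str


def self_train(
    model: object,
    data: List[Example],
    catalyst: List[Example],
    unfold: Callable[[object, Example], str],
    select: Callable[[str, str], str],
    finetune: Callable[[object, List[Example]], object],
    iterations: int = 3,
) -> Tuple[object, List[Example]]:
    """Iteratively let the model lengthen its own training responses."""
    for _ in range(iterations):
        refined = []
        for ex in data:
            # The current model unfolds the existing response into a longer chain.
            candidate = unfold(model, ex)
            # Assumption: some selector decides between the old and new response.
            refined.append(Example(ex.prompt, select(ex.response, candidate)))
        # Assumption: catalyst demonstrations are included in every round.
        model = finetune(model, catalyst + refined)
        data = refined
    return model, data


# Toy stand-ins so the sketch runs end to end; none of these reflect the paper.
def toy_unfold(model: object, ex: Example) -> str:
    return ex.response + " (expanded)"


def toy_select(old: str, new: str) -> str:
    return new if len(new) > len(old) else old


def toy_finetune(model: int, train_set: List[Example]) -> int:
    return model + 1  # the "model" here is just a round counter
```

With real components, `unfold` would prompt the current model to expand a response, and `finetune` would run supervised training on the combined set; the toy versions only show how data and model are threaded through each round.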

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies

Methods: Softmax · Attention Is All You Need