Dual-Phase LLM Reasoning: Self-Evolved Mathematical Frameworks

ShaoZhen Liu; Xinting Huang; Houwen Peng; Xin Chen; Xinyang Song; Qi Li; Zhenan Sun

arXiv:2601.05616 · cs.LG · January 12, 2026

Open Access

TL;DR

This paper introduces a dual-phase training framework for large language models that enhances their reasoning abilities through self-generated data and difficulty-aware sampling, leading to improved performance on mathematical benchmarks.

Contribution

It proposes a novel two-stage training method combining self-generated chain-of-thought data and dynamic data filtering to boost LLM reasoning capabilities.

Findings

01

Generated reasoning chains become more than 4 times longer.

02

Improved performance on GSM8K and MATH500 benchmarks.

03

Enhanced handling of complex mathematical problems.

Abstract

In recent years, large language models (LLMs) have demonstrated significant potential in complex reasoning tasks like mathematical problem-solving. However, existing research predominantly relies on reinforcement learning (RL) frameworks while overlooking supervised fine-tuning (SFT) methods. This paper proposes a new two-stage training framework that enhances models' self-correction capabilities through self-generated long chain-of-thought (CoT) data. During the first stage, a multi-turn dialogue strategy guides the model to generate CoT data incorporating verification, backtracking, subgoal decomposition, and backward reasoning, with predefined rules filtering high-quality samples for supervised fine-tuning. The second stage employs a difficulty-aware rejection sampling mechanism to dynamically optimize data distribution, strengthening the model's ability to handle complex problems.…
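The abstract does not spell out how the difficulty-aware rejection sampling works, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes difficulty is estimated as one minus the pass rate over sampled solutions, and that harder problems are allowed to contribute more correct samples to the fine-tuning set. The names `Candidate`, `keep_budget`, and `max_keep` are hypothetical and do not come from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """One sampled solution for a problem (hypothetical structure)."""
    problem_id: str
    solution: str
    is_correct: bool


def pass_rate(candidates):
    """Fraction of sampled solutions that were judged correct."""
    if not candidates:
        return 0.0
    return sum(c.is_correct for c in candidates) / len(candidates)


def keep_budget(rate, max_keep=4):
    """Assumed rule: the lower the pass rate, the more correct samples we keep.

    Difficulty is taken to be 1 - pass rate; at least one sample is kept.
    """
    difficulty = 1.0 - rate
    return max(1, math.ceil(difficulty * max_keep))


def difficulty_aware_rejection_sampling(samples_by_problem, max_keep=4):
    """Reject incorrect samples, then keep more samples for harder problems."""
    selected = []
    for candidates in samples_by_problem.values():
        correct = [c for c in candidates if c.is_correct]
        if not correct:
            # No verified solution exists, so there is nothing to train on.
            continue
        budget = keep_budget(pass_rate(candidates), max_keep)
        selected.extend(correct[:budget])
    return selected
```

Under these assumptions, a problem the model always solves contributes a single sample, while a problem it rarely solves contributes up to `max_keep` correct samples, which shifts the training distribution toward harder problems.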

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Multimodal Machine Learning Applications