Deep Self-Evolving Reasoning
Zihan Liu, Shun Zheng, Xumeng Wen, Yang Wang, Jiang Bian, Mao Yang

TL;DR
Deep Self-Evolving Reasoning (DSER) is a probabilistic iterative framework that strengthens reasoning in smaller, open-weight language models. It amplifies a small per-iteration bias toward improving a solution over degrading it, which lets these models solve problems on challenging benchmarks that they otherwise miss.
Contribution
This work introduces DSER, a probabilistic paradigm that extends the reasoning capabilities of open-weight models through iterative self-evolution, even when their verification and refinement abilities are weak.
Findings
DSER solves 5 out of 9 previously unsolvable AIME problems.
It surpasses the single-turn accuracy of a 600B-parameter teacher model.
DSER reveals fundamental limitations in current open-weight reasoners.
Abstract
Long-form chain-of-thought reasoning has become a cornerstone of advanced reasoning in large language models. While recent verification-refinement frameworks have enabled proprietary models to solve Olympiad-level problems, their effectiveness hinges on strong, reliable verification and correction capabilities, which remain fragile in open-weight, smaller-scale models. This work demonstrates that even with weak verification and refinement capabilities on hard tasks, the reasoning limits of such models can be substantially extended through a probabilistic paradigm we call Deep Self-Evolving Reasoning (DSER). We conceptualize iterative reasoning as a Markov chain, where each step represents a stochastic transition in the solution space. The key insight is that convergence to a correct solution is guaranteed as long as the probability of improvement marginally exceeds that of degradation.…
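One way to make the Markov-chain argument concrete is a two-state simplification, which is an assumption of this sketch and not necessarily the paper's exact model: a solution is either incorrect or correct, each self-evolution step fixes an incorrect solution with probability p_improve, and breaks a correct one with probability p_degrade. The stationary probability of being correct is then p_improve / (p_improve + p_degrade), which exceeds 1/2 exactly when improvement is more likely than degradation. A majority vote over many independent chains would then favor the correct answer. The function names and the values 0.12 and 0.08 are illustrative only and do not come from the paper.

```python
import random


def stationary_correct(p_improve, p_degrade):
    """Long-run probability of the 'correct' state in the two-state chain."""
    return p_improve / (p_improve + p_degrade)


def prob_correct_after(n_steps, p_improve, p_degrade, start_correct=0.0):
    """Exact probability of being correct after n_steps transitions."""
    prob = start_correct
    for _ in range(n_steps):
        # Stay correct with prob (1 - p_degrade), or get fixed with prob p_improve.
        prob = prob * (1.0 - p_degrade) + (1.0 - prob) * p_improve
    return prob


def simulate_fraction_correct(n_chains, n_steps, p_improve, p_degrade, seed=0):
    """Run independent chains from an incorrect start; return the share ending correct."""
    rng = random.Random(seed)
    n_correct = 0
    for _ in range(n_chains):
        correct = False  # hypothetical starting point: the first attempt is wrong
        for _ in range(n_steps):
            if correct:
                correct = rng.random() >= p_degrade
            else:
                correct = rng.random() < p_improve
        n_correct += correct
    return n_correct / n_chains


if __name__ == "__main__":
    p_improve, p_degrade = 0.12, 0.08  # illustrative values, not from the paper
    print("stationary:", stationary_correct(p_improve, p_degrade))
    print("after 50 steps:", prob_correct_after(50, p_improve, p_degrade))
    print("simulated:", simulate_fraction_correct(2000, 50, p_improve, p_degrade))
```

Under these assumptions a single chain is still wrong a large share of the time, so the gain comes from aggregating many chains: any stationary probability above 1/2 gives a majority that leans toward the correct answer.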
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
