S^3cMath: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners
Yuchen Yan, Jin Jiang, Yang Liu, Yixin Cao, Xin Xu, Mengdi Zhang, Xunliang Cai, Jian Shao

TL;DR
This paper introduces S^3cMath, a novel approach enabling large language models to spontaneously detect and correct errors during mathematical reasoning, significantly improving their accuracy and reliability.
Contribution
It presents the first method for spontaneous step-level self-correction in LLMs, enhancing their mathematical reasoning capabilities through a new training strategy and data construction.
Findings
Significant improvements on GSM8K and MATH benchmarks.
Effective across various foundation LLMs.
First demonstration of spontaneous self-correction in mathematical reasoning.
Abstract
Self-correction is a novel method that can stimulate the potential reasoning abilities of large language models (LLMs). It involves detecting and correcting errors during the inference process when LLMs solve reasoning problems. However, recent works do not regard self-correction as a spontaneous and intrinsic capability of LLMs. Instead, such correction is achieved through post-hoc generation, external knowledge introduction, multi-model collaboration, and similar techniques. In this paper, we propose a series of mathematical LLMs called S^3c-Math, which are able to perform Spontaneous Step-level Self-correction for Mathematical reasoning. This capability helps LLMs recognize whether their ongoing inference tends to contain errors and simultaneously correct these errors to produce a more reliable response. We propose a method that employs a step-level sampling approach to…
Taxonomy
Topics: Topic Modeling · Mathematics, Computing, and Information Processing · AI-based Problem Solving and Planning
