LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback
Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Junjie Hu, Tianyu Liu, Baobao Chang

TL;DR
This paper introduces Math-Minos, a mathematical verifier enhanced with natural language feedback, which improves verification accuracy by providing detailed step-wise explanations rather than binary labels, advancing the reliability of mathematical reasoning models.
Contribution
The paper proposes a novel verifier that incorporates step-wise natural language feedback and a two-stage training paradigm, significantly improving verification performance over traditional binary-label methods.
Findings
Natural language feedback boosts verifier performance
Two-stage training enhances learning efficiency
Code and data are publicly released
Abstract
In recent progress, mathematical verifiers have achieved success in mathematical reasoning tasks by validating the correctness of solutions generated by policy models. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate the aforementioned insufficiency of binary labels, we introduce step-wise natural language feedback as rationale labels, that is, the correctness of each step and the detailed explanations. In this paper, we propose Math-Minos, a natural language feedback-enhanced verifier by constructing automatically generated training data and a two-stage training paradigm for effective training and efficient inference. Our experiments reveal that a small set of natural language feedback can significantly boost the performance of the verifier in both verification and…
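To make the contrast between binary labels and step-wise rationale labels concrete, here is a minimal sketch of how one training record might be represented. The class names, field names, and the example problem are hypothetical and chosen for illustration only; they are not taken from the paper's released code or data. The rule that a solution counts as correct only when every step is correct is likewise an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepFeedback:
    """Feedback for one solution step (hypothetical schema)."""
    step: str          # the text of the reasoning step
    is_correct: bool   # step-level correctness
    explanation: str   # natural language rationale for the judgment


@dataclass
class RationaleLabel:
    """Step-wise natural language feedback for a whole solution."""
    question: str
    steps: List[StepFeedback]


def to_binary_label(label: RationaleLabel) -> int:
    # Assumption: a solution is correct only if all of its steps are correct.
    # This collapses the rich feedback into the single bit that a
    # binary-label verifier would be trained on.
    return int(all(s.is_correct for s in label.steps))


def first_error_index(label: RationaleLabel) -> int:
    # Index of the first incorrect step, or -1 if there is none.
    # This information is lost once only the binary label is kept.
    for i, s in enumerate(label.steps):
        if not s.is_correct:
            return i
    return -1


example = RationaleLabel(
    question="What is 3 * (4 + 5)?",
    steps=[
        StepFeedback("4 + 5 = 9", True,
                     "The sum inside the parentheses is computed correctly."),
        StepFeedback("3 * 9 = 28", False,
                     "3 * 9 equals 27, not 28."),
    ],
)

print(to_binary_label(example))    # binary view: one bit per solution
print(first_error_index(example))  # rationale view: where the error occurs
```

The binary label only says that the solution is wrong, while the rationale label also records which step failed and why, which is the extra supervision the abstract refers to.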
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Mathematics, Computing, and Information Processing
Methods: Sparse Evolutionary Training
