Get an A in Math: Progressive Rectification Prompting
Zhenyu Wu, Meng Jiang, Chao Shen

TL;DR
Progressive Rectification Prompting (PRP) enhances large language models' ability to solve math word problems by iteratively verifying and correcting reasoning paths, significantly improving accuracy over traditional Chain-of-Thought methods.
Contribution
The paper introduces PRP, a novel iterative verify-then-rectify approach that reduces errors in LLM reasoning paths for math problems, outperforming existing CoT techniques.
Findings
PRP raises average accuracy across eight math word problem datasets from 77.3% to 90.5%.
PRP effectively identifies and corrects reasoning mistakes.
PRP outperforms standard Chain-of-Thought prompting methods.
Abstract
Chain-of-Thought (CoT) prompting methods have enabled large language models (LLMs) to generate reasoning paths and solve math word problems (MWPs). However, they are sensitive to mistakes in the paths, as any mistake can result in an incorrect answer. We propose a novel method named Progressive Rectification Prompting (PRP) to improve average accuracy on eight MWP datasets from 77.3 to 90.5. Given an initial answer from CoT, PRP iterates a verify-then-rectify process to progressively identify incorrect answers and rectify the reasoning paths. With the most likely correct answer, the LLM predicts a masked numerical value in the question; if the prediction does not match the masked value, the answer is likely incorrect. Then the LLM is prompted to re-generate the reasoning path hinted with a set of incorrect answers to prevent itself from repeating previous mistakes. PRP achieves the best…
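The verify-then-rectify loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `solve` and `predict_masked` are hypothetical callables standing in for LLM calls, masking the first number in the question is an arbitrary choice, and the `max_iters` cap and the fallback of returning the last answer are assumptions not stated in the abstract.

```python
import re
from typing import Callable, List, Optional, Tuple


def mask_first_number(question: str) -> Optional[Tuple[str, float]]:
    """Replace the first number in the question with 'X'.

    Returns (masked_question, masked_value), or None if no number is found.
    Which number to mask is an assumption made for this sketch.
    """
    match = re.search(r"\d+(?:\.\d+)?", question)
    if match is None:
        return None
    masked = question[:match.start()] + "X" + question[match.end():]
    return masked, float(match.group())


def progressive_rectification(
    question: str,
    solve: Callable[[str, List[float]], float],
    predict_masked: Callable[[str, float], float],
    max_iters: int = 5,  # hypothetical cap, not taken from the abstract
) -> float:
    """Iterate verify-then-rectify until an answer passes verification.

    solve(question, incorrect_answers): hypothetical LLM call that generates
        a reasoning path and returns a numeric answer, hinted to avoid the
        answers already judged incorrect.
    predict_masked(masked_question, answer): hypothetical LLM call that
        predicts the masked value, taking the candidate answer as given.
    """
    incorrect: List[float] = []
    answer = solve(question, incorrect)  # initial CoT answer, no hints yet

    masked = mask_first_number(question)
    if masked is None:
        return answer  # nothing to mask, so verification is skipped
    masked_question, true_value = masked

    for _ in range(max_iters):
        # Verify: a correct answer should let the model recover the masked value.
        predicted = predict_masked(masked_question, answer)
        if abs(predicted - true_value) < 1e-6:
            return answer
        # Rectify: record the rejected answer and re-generate with the hint.
        incorrect.append(answer)
        answer = solve(question, incorrect)

    return answer  # assumption: fall back to the last answer when the cap is hit
```

In practice both callables would wrap prompts to the same LLM; keeping them as parameters lets the control flow be exercised with simple stubs before any model is involved.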
Taxonomy
Topics: Topic Modeling · Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning
Methods: Sparse Evolutionary Training
