Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning
Vernon Y.H. Toh, Deepanway Ghosal, Soujanya Poria

TL;DR
Prove is a verification framework that translates natural language solutions into programs and uses the program outputs to filter out incorrect reasoning paths before answers are aggregated, improving the accuracy of open-source language models on mathematical reasoning tasks.
Contribution
This work introduces Prove, a verification method that translates each natural language solution into a program and uses the program's output to decide which sampled solutions count toward the self-consistency vote in math reasoning.
Findings
Prove outperforms vanilla majority voting across all tested models and datasets.
Prove achieves an accuracy improvement of up to 18% on GSM8K.
Prove is effective across models ranging from 0.5B to 13B parameters.
Abstract
Large language models (LLMs) have shown increasing competence in solving mathematical reasoning problems. However, many open-source LLMs still struggle with errors in calculation and semantic understanding during intermediate reasoning steps. In this work, we introduce Prove, a simple yet effective framework that leverages translated programs derived from natural language solutions as a verification mechanism to filter out potentially incorrect reasoning paths before aggregating final answers. Unlike vanilla majority voting, our approach filters out solutions whose corresponding program output is inconsistent with the generated solution, aggregating only those that pass verification. We conducted extensive experiments using 13 open-source LLMs from various model families and sizes, ranging from 0.5B to 13B parameters, across eight mathematical benchmarks. Our results show that Prove…
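The filter-then-vote idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Candidate` container, the numeric tolerance, and the function names are hypothetical, and the sketch assumes that the translation of each solution into a program and the execution of that program have already happened upstream. The fallback to a plain majority vote when no candidate passes verification is also an assumption made here so the function always returns an answer.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One sampled reasoning path (hypothetical container, not from the paper)."""
    answer: float                    # final answer stated in the natural language solution
    program_output: Optional[float]  # output of the translated program; None if it failed to run


def is_consistent(c: Candidate, tol: float = 1e-6) -> bool:
    """A candidate passes verification when its program ran and agrees with its stated answer."""
    return c.program_output is not None and abs(c.answer - c.program_output) <= tol


def majority_vote(answers: List[float]) -> float:
    """Vanilla self-consistency: return the most frequent answer."""
    return Counter(answers).most_common(1)[0][0]


def verified_vote(candidates: List[Candidate]) -> float:
    """Aggregate only the answers whose translated program agrees with them."""
    verified = [c.answer for c in candidates if is_consistent(c)]
    # Assumption: fall back to the unfiltered vote if nothing passes verification.
    pool = verified if verified else [c.answer for c in candidates]
    return majority_vote(pool)


# Toy example with made-up numbers: three paths answer 12, two answer 18,
# but only the two paths answering 18 are confirmed by their programs.
samples = [
    Candidate(answer=12, program_output=18),
    Candidate(answer=12, program_output=None),
    Candidate(answer=12, program_output=10),
    Candidate(answer=18, program_output=18),
    Candidate(answer=18, program_output=18),
]
vanilla = majority_vote([c.answer for c in samples])
filtered = verified_vote(samples)
```

In this toy case the unfiltered vote picks 12 because it is the most frequent answer, while the filtered vote picks 18 because the three paths answering 12 are discarded before aggregation. This is the sense in which not every vote counts.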
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Intelligent Tutoring Systems and Adaptive Learning · Natural Language Processing Techniques
