Step-Wise Formal Verification for LLM-Based Mathematical Problem Solving

Kuo Zhou; Lu Zhang

arXiv:2505.20869·cs.AI·May 28, 2025

Step-Wise Formal Verification for LLM-Based Mathematical Problem Solving

Kuo Zhou, Lu Zhang

PDF

Open Access

TL;DR

This paper introduces MATH-VF, a framework combining formal verification and correction tools to improve the accuracy of LLM-generated solutions to mathematical problems, verified on standard benchmarks.

Contribution

It presents a novel step-wise verification framework for LLM-based math problem solving, integrating formal translation and external tools for correctness assessment and correction.

Findings

01

MATH-VF effectively verifies solutions on MATH500 and ProcessBench.

02

The framework improves solution accuracy over existing methods.

03

It enables iterative refinement of LLM solutions through formal verification.

Abstract

Large Language Models (LLMs) have demonstrated formidable capabilities in solving mathematical problems, yet they may still commit logical reasoning and computational errors during the problem-solving process. Thus, this paper proposes a framework, MATH-VF, which includes a Formalizer and a Critic, for formally verifying the correctness of the solutions generated by large language models. Our framework first utilizes a Formalizer which employs an LLM to translate a natural language solution into a formal context. Afterward, our Critic (which integrates various external tools such as a Computer Algebra System and an SMT solver) evaluates the correctness of each statement within the formal context, and when a statement is incorrect, our Critic provides corrective feedback. We empirically investigate the effectiveness of MATH-VF in two scenarios: 1) Verification: MATH-VF is utilized to…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsMathematics, Computing, and Information Processing · Intelligent Tutoring Systems and Adaptive Learning