Pseudo-Formalization for Automatic Proof Verification
Slim Barkallah, Luke Bailey, Kaiyue Wen, Mohammed Abouzaid, Tengyu Ma

TL;DR
This paper introduces Pseudo-Formalization, a flexible proof format that combines natural language and formal verification, enabling AI systems to verify complex mathematical proofs more effectively.
Contribution
The paper proposes Pseudo-Formalization and Block Verification, a novel approach to verify natural language proofs by translating them into a modular, formal-like format.
Findings
PF+BV outperforms LLM-as-judge baselines in error detection
PF+BV achieves better precision and recall on benchmark datasets
Research-level proof verification benchmark ArxivMathGradingBench is released
Abstract
Reliable verification of proofs remains a bottleneck for training and evaluating AI systems on hard mathematical reasoning. Fully formal proofs, in languages like Lean, are easy to verify because they are unambiguous and modular. Most proofs, particularly those written by AI systems, have neither property, and translating them into formal languages remains challenging in many frontier math settings. We propose Pseudo-Formalization (PF), a proof format that captures the modularity and precision of formal proofs while retaining the flexibility of natural language. A Pseudo-Formal proof is decomposed into self-contained modules, each stating its premises, conclusion, and proof in natural language. To verify the correctness of a regular natural language proof, an LLM translates it to Pseudo-Formal and then verifies each module independently, an algorithm we call Block Verification (BV). We…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
