Solving Formal Math Problems by Decomposition and Iterative Reflection
Yichi Zhou, Jianqiu Zhao, Yongxin Zhang, Bohan Wang, Siran Wang, Luoxin Chen, Jiahui Wang, Haowei Chen, Allan Jie, Xinbo Zhang, Haocheng Wang, Luong Trung, Rong Ye, Phan Nhat Hoang, Huishuai Zhang, Peng Sun, Hang Li

TL;DR
Delta Prover is an innovative agent-based framework that enables general-purpose LLMs to construct formal proofs in Lean 4 through decomposition and reflection, achieving state-of-the-art success without model fine-tuning.
Contribution
The paper introduces Delta Prover, a novel agent framework combining reflective decomposition and proof repair, allowing LLMs to effectively perform formal theorem proving in Lean 4 without specialized training.
Findings
Achieves 95.9% success rate on miniF2F benchmark.
Surpasses all existing approaches including specialized models.
Exhibits stronger test-time scaling law than standard methods.
Abstract
General-purpose Large Language Models (LLMs) have achieved remarkable success in intelligence, performing comparably to human experts on complex reasoning tasks such as coding and mathematical reasoning. However, generating formal proofs in specialized languages like Lean 4 remains a significant challenge for these models, limiting their application in complex theorem proving and automated verification. Current approaches typically require specializing models through fine-tuning on dedicated formal corpora, incurring high costs for data collection and training. In this work, we introduce \textbf{Delta Prover}, an agent-based framework that orchestrates the interaction between a general-purpose LLM and the Lean 4 proof environment. Delta Prover leverages the reflection and reasoning capabilities of general-purpose LLMs to interactively construct formal proofs in Lean 4, circumventing the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, programming, and type systems · Mathematics, Computing, and Information Processing · Model-Driven Software Engineering Techniques
