Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving
Yong Lin, Shange Tang, Bohan Lyu, Jiayun Wu, Hongzhou Lin, Kaiyu Yang,, Jia Li, Mengzhou Xia, Danqi Chen, Sanjeev Arora, Chi Jin

TL;DR
Goedel-Prover is an open-source language model that significantly advances automated theorem proving by generating formal proofs for mathematical problems, outperforming previous models and providing extensive open datasets and proofs.
Contribution
The paper introduces Goedel-Prover, a novel open-source language model that achieves state-of-the-art performance in formal proof generation and provides large-scale datasets and proofs for the community.
Findings
Achieves 57.6% success rate on miniF2F without RL
Solves 7 problems on PutnamBench, ranking first
Provides 29.7K formal proofs in Lean Workbook
Abstract
We introduce Goedel-Prover, an open-source language model that achieves state-of-the-art (as of April 5 2025) performance in automated formal proof generation for mathematical problems. A key challenge in this field is the scarcity of formalized mathematical statements and proofs, which we address through the following approaches. First, we train LLMs to convert natural language math problems from the Numina dataset to equivalent formal statements in Lean 4. This process creates the dataset Goedel-Pset-v1, which includes 1.64 million formal statements. Next, we develop a large dataset of formal proofs by training a series of provers. Each new prover can prove many statements that previous ones could not, and these new proofs are added to the training set for the next prover. Finally, we obtain the dataset Goedel-Pset-v1-solved, which contains proofs for over 800K statements from…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsScientific Computing and Data Management · Distributed and Parallel Computing Systems · Logic, programming, and type systems
MethodsSparse Evolutionary Training
