STP: Self-play LLM Theorem Provers with Iterative Conjecturing and   Proving

Kefan Dong; Tengyu Ma

arXiv:2502.00212·cs.LG·March 24, 2025

STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving

Kefan Dong, Tengyu Ma

PDF

Open Access 1 Repo 2 Models 2 Datasets 1 Video

TL;DR

STP introduces a self-play framework for LLM-based theorem proving, where conjecturing and proving mutually improve, significantly enhancing proof success rates and achieving state-of-the-art results on multiple benchmarks.

Contribution

The paper presents a novel self-play approach for LLM theorem provers, enabling iterative conjecture generation and proof, leading to substantial performance improvements over prior methods.

Findings

01

Proves 28.5% of statements in LeanWorkbook, doubling previous best.

02

Achieves state-of-the-art on miniF2F-test, Proofnet-test, and PutnamBench.

03

Generated 51.3 billion tokens during training.

Abstract

A fundamental challenge in formal theorem proving by LLMs is the lack of high-quality training data. Although reinforcement learning or expert iteration partially mitigates this issue by alternating between LLM generating proofs and finetuning them on correctly generated ones, performance quickly plateaus due to the scarcity of correct proofs (sparse rewards). To keep improving the models with limited data, we draw inspiration from mathematicians, who continuously develop new results, partly by proposing novel conjectures or exercises (which are often variants of known results) and attempting to solve them. We design the Self-play Theorem Prover (STP) that simultaneously takes on two roles, conjecturer and prover, each providing training signals to the other. The conjecturer is trained iteratively on previously generated conjectures that are barely provable by the current prover, which…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

kfdong/STP
jaxOfficial

Models

Datasets

Videos

STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving· slideslive

Taxonomy

TopicsAuction Theory and Applications