StepFun-Prover Preview: Let's Think and Verify Step by Step

Shijie Shang; Ruosi Wan; Yue Peng; Yutong Wu; Xiong-hui Chen; Jie Yan; Xiangyu Zhang

arXiv:2507.20199·cs.AI·August 14, 2025

StepFun-Prover Preview: Let's Think and Verify Step by Step

Shijie Shang, Ruosi Wan, Yue Peng, Yutong Wu, Xiong-hui Chen, Jie Yan, Xiangyu Zhang

PDF

Open Access

TL;DR

StepFun-Prover Preview is a large language model that uses tool-integrated reinforcement learning to generate formal proofs efficiently, achieving high success rates on theorem proving benchmarks and advancing automated reasoning methods.

Contribution

The paper introduces a novel reinforcement learning framework for training language models with tool-based reasoning capabilities for theorem proving.

Findings

01

Achieves 70.0% pass@1 on miniF2F benchmark

02

Demonstrates effective human-like iterative proof refinement

03

Provides an end-to-end training framework for tool-integrated reasoning

Abstract

We present StepFun-Prover Preview, a large language model designed for formal theorem proving through tool-integrated reasoning. Using a reinforcement learning pipeline that incorporates tool-based interactions, StepFun-Prover can achieve strong performance in generating Lean 4 proofs with minimal sampling. Our approach enables the model to emulate human-like problem-solving strategies by iteratively refining proofs based on real-time environment feedback. On the miniF2F-test benchmark, StepFun-Prover achieves a pass@1 success rate of $70.0%$ . Beyond advancing benchmark performance, we introduce an end-to-end training framework for developing tool-integrated reasoning models, offering a promising direction for automated theorem proving and Math AI assistant.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsLogic, programming, and type systems · Artificial Intelligence in Games · Robot Manipulation and Learning