InternLM2.5-StepProver: Advancing Automated Theorem Proving via Critic-Guided Search

Zijian Wu; Suozhi Huang; Zhejian Zhou; Huaiyuan Ying; Zheng Yuan; Wenwei Zhang; Dahua Lin; Kai Chen

arXiv:2410.15700·cs.AI·October 22, 2025

InternLM2.5-StepProver: Advancing Automated Theorem Proving via Critic-Guided Search

Zijian Wu, Suozhi Huang, Zhejian Zhou, Huaiyuan Ying, Zheng Yuan, Wenwei Zhang, Dahua Lin, Kai Chen

PDF

Open Access 1 Repo 2 Models

TL;DR

This paper introduces InternLM2.5-StepProver, a novel approach that employs a critic-guided search method to enhance automated theorem proving with large language models, leading to significant performance improvements.

Contribution

It proposes a critic-guided search framework for LLM-based theorem proving and demonstrates substantial performance gains through large-scale expert iteration.

Findings

01

Proposed critic-guided search improves proof success rate from 59.4% to 65.9%.

02

Large-scale expert iteration with 20,000 CPU days enhances model performance.

03

Open-sourced models and proof data for community use.

Abstract

Large Language Models (LLMs) have emerged as powerful tools in mathematical theorem proving, particularly when utilizing formal languages such as LEAN. A prevalent proof method involves the LLM prover iteratively constructing the proof tactic by tactic, typically following a best-first search scheme. However, this method often ignores the critical preference information inside the existing tactic trajectories, hindering the search for deeper proofs. We propose an intuitive yet effective method, which utilizes a critic model to capture the preference information and to guide the search of the prover model at runtime. Given the prover-critic framework, a large-scale expert iteration with more than 20,000 CPU days is then applied to further fine-tune the prover and the critic. The trained InternLM2.5-StepProver critic significantly boosts the performance of the prover model (59.4% to…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

internlm/internlm-math
pytorchOfficial

Models

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsMachine Learning and Data Classification · AI-based Problem Solving and Planning