Spark-Prover-X1: Formal Theorem Proving Through Diverse Data Training

Xinyuan Zhou; Yi Lei; Xiaoyu Zhou; Jingyi Sun; Yu Zhu; Zhongyi Ye; Weitai Zhang; Quan Liu; Si Wei; Cong Liu

arXiv:2511.13043·cs.CL·December 2, 2025

Spark-Prover-X1: Formal Theorem Proving Through Diverse Data Training

Xinyuan Zhou, Yi Lei, Xiaoyu Zhou, Jingyi Sun, Yu Zhu, Zhongyi Ye, Weitai Zhang, Quan Liu, Si Wei, Cong Liu

PDF

Open Access

TL;DR

Spark-Prover-X1 is a 7B parameter language model trained through a multi-stage process, including broad mathematical pre-training, specialized fine-tuning, and policy optimization, achieving state-of-the-art results in formal theorem proving on challenging benchmarks.

Contribution

The paper introduces a novel multi-stage training framework for lightweight LLMs, combining diverse data, a new reasoning task, and policy optimization to enhance formal reasoning capabilities.

Findings

01

Achieves state-of-the-art performance among similar-sized models.

02

Solves 27 problems on PutnamBench (pass@32).

03

Scores 24.0% on CombiBench (pass@32).

Abstract

Large Language Models (LLMs) have shown significant promise in automated theorem proving, yet progress is often constrained by the scarcity of diverse and high-quality formal language data. To address this issue, we introduce Spark-Prover-X1, a 7B parameter model trained via an three-stage framework designed to unlock the reasoning potential of more accessible and moderately-sized LLMs. The first stage infuses deep knowledge through continuous pre-training on a broad mathematical corpus, enhanced by a suite of novel data tasks. Key innovation is a "CoT-augmented state prediction" task to achieve fine-grained reasoning. The second stage employs Supervised Fine-tuning (SFT) within an expert iteration loop to specialize both the Spark-Prover-X1-7B and Spark-Formalizer-X1-7B models. Finally, a targeted round of Group Relative Policy Optimization (GRPO) is applied to sharpen the prover's…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsTopic Modeling · Machine Learning in Materials Science · Mathematics, Computing, and Information Processing