TheoremForge: Scaling up Formal Data Synthesis with Low-Budget Agentic Workflow
Yicheng Tao, Hongteng Xu

TL;DR
TheoremForge introduces a cost-effective, scalable formal data synthesis pipeline that decomposes formalization into five sub-tasks, recovering valuable training signals from failed attempts to improve proof generation efficiency.
Contribution
It presents a novel decoupled extraction strategy and a five-step workflow that significantly reduces costs and increases data yield for formal mathematics data synthesis.
Findings
Achieved a 12.6% verified rate on 2,000 problems, surpassing the baseline of 8.6%.
Reduced average cost to $0.481 per successful trajectory.
Increased proof generation data yield by 1.6 times.
Abstract
The high cost of agentic workflows in formal mathematics hinders large-scale data synthesis, exacerbating the scarcity of open-source corpora. To address this, we introduce \textbf{TheoremForge}, a cost-effective formal data synthesis pipeline that decomposes the formalization process into five sub-tasks, which are \textit{statement formalization}, \textit{proof generation}, \textit{premise selection}, \textit{proof correction} and \textit{proof sketching}. By implementing a \textit{Decoupled Extraction Strategy}, the workflow recovers valid training signals from globally failed trajectories, effectively utilizing wasted computation. Experiments on a 2,000-problem benchmark demonstrate that TheoremForge achieves a Verified Rate of 12.6\%, surpassing the 8.6\% baseline, at an average cost of only \textbf{$0.481} per successful trajectory using Gemini-3-Flash. Crucially, our strategy…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, programming, and type systems · Scientific Computing and Data Management · Machine Learning in Materials Science
