M2F: Automated Formalization of Mathematical Literature at Scale
Zichen Wang, Wanli Ma, Zhenyu Ming, Gong Zhang, Kun Yuan, and Zaiwen Wen

TL;DR
M2F is an agentic framework that automates the end-to-end formalization of long mathematical texts, such as textbooks, into Lean, reducing the time and effort needed for project-scale formalization.
Contribution
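The paper introduces M2F (Math-to-Formal), which the authors present as the first agentic framework for project-scale autoformalization in Lean. It scales formalization to entire textbooks by automatically handling dependency ordering, repair of declarations, and proof completion.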
Findings
Formalized 479 pages of textbooks into Lean in three weeks.
Achieved a 96% proof success rate, outperforming baseline methods.
Demonstrated practical large-scale formalization of mathematical literature.
Abstract
Automated formalization of mathematics enables mechanical verification but remains limited to isolated theorems and short snippets. Scaling to textbooks and research papers is largely unaddressed, as it requires managing cross-file dependencies, resolving imports, and ensuring that entire projects compile end-to-end. We present M2F (Math-to-Formal), the first agentic framework for end-to-end, project-scale autoformalization in Lean. The framework operates in two stages. The statement compilation stage splits the document into atomic blocks, orders them via inferred dependencies, and repairs declaration skeletons until the project compiles, allowing placeholders in proofs. The proof repair stage closes these holes under fixed signatures using goal-conditioned local edits. Throughout both stages, M2F keeps the verifier in the loop, committing edits only when toolchain feedback confirms…
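The abstract's two recurring ideas, ordering blocks by inferred dependencies and committing an edit only when the verifier accepts it, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the block identifiers, the `order_blocks` and `commit_if_verified` helpers, and the `toy_verify` stand-in for the Lean toolchain are all hypothetical names introduced here for illustration.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical types: a project maps a block id to its Lean source text.
Project = dict[str, str]


def order_blocks(deps: dict[str, set[str]]) -> list[str]:
    """Order block ids so each block comes after the blocks it depends on.

    `deps` maps a block id to the set of block ids it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


def commit_if_verified(
    project: Project,
    block_id: str,
    candidate: str,
    verify: Callable[[Project], bool],
) -> bool:
    """Apply a candidate edit only if the verifier accepts the result.

    The edit is tried on a copy, so a rejected candidate leaves
    `project` unchanged.
    """
    trial = {**project, block_id: candidate}
    if verify(trial):
        project[block_id] = candidate
        return True
    return False


def toy_verify(project: Project) -> bool:
    """Assumed stand-in for a real compiler check.

    A real pipeline would invoke the Lean toolchain here; this toy
    version only rejects text containing the marker 'ERROR'.
    """
    return all("ERROR" not in source for source in project.values())
```

In this sketch a statement-stage skeleton would be committed with a `sorry` placeholder in its proof, and a later proof-stage edit would replace only the proof body through the same `commit_if_verified` gate, leaving the signature fixed.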
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Logic, programming, and type systems · Polynomial and algebraic computation
