LeanGeo: Formalizing Competitional Geometry problems in Lean
Chendong Song, Zihan Wang, Frederick Pu, Haiming Wang, Xiaohan Lin, Junqi Liu, Jia Li, Zhengying Liu

TL;DR
LeanGeo is a formal system built in Lean 4 that unifies the formalization and solving of advanced geometry problems, enabling rigorous proof verification and integration with mathematical libraries.
Contribution
It introduces LeanGeo, a comprehensive formalization framework for geometry problems, and provides a benchmark to evaluate AI reasoning in geometric contexts.
Findings
Demonstrates LeanGeo's effectiveness in formalizing geometry problems.
Evaluates large language models' capabilities on the geometry benchmark.
Highlights current limitations of AI in automated geometric reasoning.
Abstract
Geometry problems are a crucial testbed for AI reasoning capabilities. Most existing geometry solving systems cannot express problems within a unified framework, thus are difficult to integrate with other mathematical fields. Besides, since most geometric proofs rely on intuitive diagrams, verifying geometry problems is particularly challenging. To address these gaps, we introduce LeanGeo, a unified formal system for formalizing and solving competition-level geometry problems within the Lean 4 theorem prover. LeanGeo features a comprehensive library of high-level geometric theorems with Lean's foundational logic, enabling rigorous proof verification and seamless integration with Mathlib. We also present LeanGeo-Bench, a formal geometry benchmark in LeanGeo, comprising problems from the International Mathematical Olympiad (IMO) and other advanced sources. Our evaluation demonstrates the…
Peer Reviews
Decision·Submitted to ICLR 2026
* The LeanGeo library allows formalizing and solving competition-level geometry problems in Lean 4. It includes an extensive library of high-level definitions and tactics, which makes formal proofs more intuitive and understandable. LeanGeo's integration with Mathlib allows it to leverage powerful tools from other areas of maths. * The LeanGeo-Bench benchmark is useful for evaluating advances in the field. * The paper is generally well-written and easy to follow.
I find some of the claims insufficiently supported and some more details would be helpful, as explained below. * The main limitation of LeanEuclid compared to LeanGeo seems to be a limited set of formalized geometry facts, as stated at line 52. Is it difficult to expand LeanEuclid's library? If not, what additional advantage does LeanGeo has? * The paper claims that LeanGeo allows expressing and reasoning about geometry problems in a human-like manner. This may be debatable as the examples prese
- The paper presents a clear structure and provides readable examples; it is well-written, although it has minor wording issues. - The system design is plausible and well-motivated. - It is a valuable step toward Lean-native, competition-level geometry with cross-domain reasoning.
- The experiments are thin and largely non-diagnostic. There are no core ablations—no SMT off/partial settings, no scaling with library size or lemma granularity, no comparisons of tactic schedules, prompts, or decoding—so it’s unclear why models fail. - It’s also unclear how this benchmark compares to others. There’s no side-by-side test on the same problems against existing Lean or geometry benchmarks, so claims about being better or different are mostly qualitative. - Scalability and complexi
LeanGeo provides a novel framework that integrates competition-level geometry into Lean 4, complemented by LeanGeo-Bench, offering a valuable resource for AI reasoning research. The seamless integration with Mathlib enables LeanGeo to leverage algebraic and inequality tools, enhancing its applicability to complex, interdisciplinary mathematical problems.
1. LeanGeo heavily relies on existing frameworks such as LeanEuclid (Murphy et al., 2024) and SystemE (Avigad et al., 2009), with its primary contributions being an expanded theorem library and 52 new abbreviations (syntactic sugar). This incremental approach lacks substantial novelty. Compared to AlphaGeometry (Trinh et al., 2024), which introduces innovative neurosymbolic search, LeanGeo’s “human-like” proofs show limited differentiation. Table 1 highlights qualitative differences but fails to
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsBusiness Strategy and Innovation · Manufacturing Process and Optimization · Operations Management Techniques
