Autoformalizing Euclidean Geometry
Logan Murphy, Kaiyu Yang, Jialiang Sun, Zhaoyu Li, Anima Anandkumar,, Xujie Si

TL;DR
This paper presents a neuro-symbolic framework combining domain knowledge, SMT solvers, and large language models to automatically formalize Euclidean geometry, addressing challenges posed by diagrammatic reasoning and providing a new benchmark dataset.
Contribution
It introduces LeanEuclid, a novel autoformalization benchmark for Euclidean geometry, and demonstrates how LLMs can be used with theorem provers to formalize geometry problems.
Findings
GPT-4 and GPT-4V show promising capabilities in autoformalizing geometry problems.
The framework effectively fills diagrammatic gaps using theorem provers, simplifying LLM tasks.
The dataset and code are publicly available for further research.
Abstract
Autoformalization involves automatically translating informal math into formal theorems and proofs that are machine-verifiable. Euclidean geometry provides an interesting and controllable domain for studying autoformalization. In this paper, we introduce a neuro-symbolic framework for autoformalizing Euclidean geometry, which combines domain knowledge, SMT solvers, and large language models (LLMs). One challenge in Euclidean geometry is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize. To address this issue, we use theorem provers to fill in such diagrammatic information automatically, so that the LLM only needs to autoformalize the explicit textual steps, making it easier for the model. We also provide automatic semantic evaluation for autoformalized theorem statements. We construct LeanEuclid, an autoformalization benchmark consisting of problems…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsManufacturing Process and Optimization · Advanced Numerical Analysis Techniques · Robotic Mechanisms and Dynamics
MethodsAttention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections
