LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Kaiyu Yang; Aidan M. Swope; Alex Gu; Rahul Chalamala; Peiyang Song,; Shixing Yu; Saad Godil; Ryan Prenger; Anima Anandkumar

arXiv:2306.15626·cs.LG·October 30, 2023·38 cites

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song,, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

PDF

Open Access 3 Repos 2 Models 1 Datasets

TL;DR

LeanDojo introduces an open-source framework with data, models, and benchmarks for theorem proving using retrieval-augmented language models, significantly advancing research accessibility and effectiveness in formal theorem proving.

Contribution

It provides the first open-source toolkit and benchmark for LLM-based theorem proving, including a retrieval-augmented prover trained with minimal resources.

Findings

01

ReProver outperforms non-retrieval baselines and GPT-4.

02

The benchmark challenges generalization to novel premises.

03

LeanDojo enables reproducible research in formal theorem proving.

Abstract

Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lean and enables interaction with the proof environment programmatically. It contains fine-grained annotations of premises in proofs, providing valuable data for premise selection: a key bottleneck in theorem proving. Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library. It is…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

Models

Datasets

tasksource/leandojo
dataset· 133 dl
133 dl

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsMathematics, Computing, and Information Processing · Natural Language Processing Techniques · Logic, programming, and type systems

MethodsMulti-Head Attention · Attention Is All You Need · Dense Connections · Dropout · Byte Pair Encoding · Softmax · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer · Absolute Position Encodings