TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

Ruida Wang; Jipeng Zhang; Yizhen Jia; Rui Pan; Shizhe Diao; Renjie Pi,; Tong Zhang

arXiv:2407.03203·cs.FL·October 7, 2024

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi,, Tong Zhang

PDF

Open Access 1 Repo 1 Models 1 Datasets 1 Video

TL;DR

TheoremLlama is a new framework that trains general-purpose large language models to become experts in Lean4 theorem proving by generating aligned datasets and employing specialized training techniques, significantly improving formal proof accuracy.

Contribution

It introduces an end-to-end training framework with novel NL-FL dataset generation, bootstrapping, curriculum learning, and iterative proof writing for formal theorem proving.

Findings

01

Achieves higher accuracy than GPT-4 on MiniF2F datasets.

02

Provides an aligned NL-FL dataset called Open Bootstrapped Theorems.

03

Demonstrates the effectiveness of bootstrapping and curriculum learning in formal proof generation.

Abstract

Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. However, due to the scarcity of aligned NL and Formal Language (FL) theorem-proving data most modern LLMs exhibit suboptimal performance.This scarcity results in a paucity of methodologies for training LLMs and techniques to fully utilize their capabilities in composing formal proofs. To address these challenges, this paper proposes TheoremLlama, an end-to-end framework that trains a general-purpose LLM to be a Lean4 expert. TheoremLlama includes NL-FL dataset generation and bootstrapping method to obtain aligned dataset, curriculum learning and block training techniques to train the model, and iterative proof…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

RickySkywalker/TheoremLlama
pytorchOfficial

Models

🤗
RickyDeSkywalker/TheoremLlama
model· 1 dl· ♡ 7
1 dl♡ 7

Datasets

RickyDeSkywalker/OpenBootstrappedTheorem
dataset· 7 dl
7 dl

Videos

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts· underline

Taxonomy

TopicsSemantic Web and Ontologies · Quality and Management Systems · Engineering Technology and Methodologies

MethodsAttention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Byte Pair Encoding · Layer Normalization · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam