Proof Automation with Large Language Models
Minghai Lu, Benjamin Delaware, Tianyi Zhang

TL;DR
This paper introduces PALM, a generate-then-repair method for LLM-based formal proof generation that raises proof success rates and proves theorems that existing approaches could not.
Contribution
The paper presents PALM, a novel approach combining LLMs with symbolic repair techniques to improve formal proof automation in interactive theorem provers.
Findings
PALM outperforms existing methods in theorem proving success rates.
PALM proves 1270 theorems that existing approaches fail to prove.
The approach is effective across different LLMs.
Abstract
Interactive theorem provers such as Coq are powerful tools to formally guarantee the correctness of software. However, using these tools requires significant manual effort and expertise. While Large Language Models (LLMs) have shown promise in automatically generating informal proofs in natural language, they are less effective at generating formal proofs in interactive theorem provers. In this paper, we conduct a formative study to identify common mistakes made by LLMs when asked to generate formal proofs. By analyzing 520 proof generation errors made by GPT-3.5, we found that GPT-3.5 often identified the correct high-level structure of a proof, but struggled to get the lower-level details correct. Based on this insight, we propose PALM, a novel generate-then-repair approach that first prompts an LLM to generate an initial proof and then leverages targeted symbolic methods to…
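The generate-then-repair idea described in the abstract can be illustrated with a minimal Python sketch. It is not PALM's implementation: the abstract is truncated before the repair mechanisms are described, so the checker, the repair table, and every name here (`generate_then_repair`, `toy_check`, `toy_repairs`) are hypothetical stand-ins. The draft proof plays the role of LLM output, and the checker plays the role of the theorem prover reporting the first failing tactic.

```python
from typing import Callable, List, Optional

Tactic = str


def generate_then_repair(
    draft: List[Tactic],
    check: Callable[[List[Tactic]], Optional[int]],
    repairs: Callable[[Tactic], List[Tactic]],
    max_rounds: int = 10,
) -> Optional[List[Tactic]]:
    """Repair a drafted proof one failing tactic at a time.

    `check` returns the index of the first failing tactic, or None if
    the whole proof is accepted. `repairs` proposes replacement tactics
    for a failing one. Both are hypothetical stand-ins for a real prover
    and for real symbolic repair rules.
    """
    proof = list(draft)
    for _ in range(max_rounds):
        bad = check(proof)
        if bad is None:
            return proof  # proof accepted
        for candidate in repairs(proof[bad]):
            trial = proof[:bad] + [candidate] + proof[bad + 1:]
            new_bad = check(trial)
            # Keep the repair only if the failure moved further along.
            if new_bad is None or new_bad > bad:
                proof = trial
                break
        else:
            return None  # no candidate repaired this step
    return None  # gave up after max_rounds


# Toy checker (assumption): a tactic "succeeds" if it is in this set.
VALID = {"intros", "induction n", "simpl", "auto"}


def toy_check(proof: List[Tactic]) -> Optional[int]:
    for i, tactic in enumerate(proof):
        if tactic not in VALID:
            return i
    return None


def toy_repairs(tactic: Tactic) -> List[Tactic]:
    # Toy rule (assumption): fall back to a generic automation tactic.
    return ["auto"]


draft = ["intros", "induction n", "apply missing_lemma", "simpl", "omega_typo"]
result = generate_then_repair(draft, toy_check, toy_repairs)
print(result)
```

The sketch keeps the high-level structure of the draft and only swaps out the individual steps that fail, which mirrors the formative-study observation that the overall proof structure is often right while low-level details are wrong.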
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Mathematics, Computing, and Information Processing
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Weight Decay · Linear Warmup With Cosine Annealing · Byte Pair Encoding · Softmax
