Getting More out of Large Language Models for Proofs

Shizhuo Dylan Zhang; Talia Ringer; Emily First

arXiv:2305.04369·cs.FL·June 2, 2023·1 cites

Getting More out of Large Language Models for Proofs

Shizhuo Dylan Zhang, Talia Ringer, Emily First

PDF

Open Access 1 Repo

TL;DR

This paper investigates the limitations of large language models in formal theorem proving, analyzing failure cases to improve their effectiveness and accessibility in proof tasks.

Contribution

It provides an analysis of failure modes of language models in proof tasks and offers insights to enhance their performance in formal theorem proving.

Findings

01

Identified common failure patterns in language models for proofs

02

Insights into prompt design to mitigate failure cases

03

Guidelines for improving model reliability in proof tasks

Abstract

Large language models have the potential to simplify formal theorem proving and make it more accessible. But how to get the most out of these models is still an open question. To answer this question, we take a step back and explore the failure cases of these models using common prompting-based techniques. Our talk will discuss these failure cases and what they can teach us about how to get more out of these models.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

dylanzsz/llm4proof
noneOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsNatural Language Processing Techniques · Mathematics, Computing, and Information Processing · Logic, programming, and type systems