Experimental results from applying GPT-4 to an unpublished formal   language

Gregor vom Scheidt

arXiv:2305.12196·cs.CL·May 23, 2023·1 cites

Experimental results from applying GPT-4 to an unpublished formal language

Gregor vom Scheidt

PDF

Open Access

TL;DR

This paper demonstrates that GPT-4 can effectively understand, generate, and verify formal language tasks, including theorem proving and syntax invention, showcasing its potential in mathematical and formal reasoning domains.

Contribution

It provides the first evidence that GPT-4 can handle complex formal language tasks, including theorem proving and syntax creation, in an unpublished formal system.

Findings

01

GPT-4 completed all tasks successfully

02

Exhibited extensive domain knowledge and inference abilities

03

Invented new syntax and semantics

Abstract

Can large language models be used to complete mathematical tasks that are traditionally performed either manually or with the aid of theorem provers? To answer this question, a state-of-the-art system, GPT-4, was provided with a concise natural language specification for a previously unpublished formal system and asked to complete a number of tasks, from stating function and type definitions to proving simple theorems and verifying user-supplied proofs. The system completed all tasks successfully, showed extensive domain knowledge, invented helpful new syntax and semantics, and exhibited generalization and inference abilities. So the answer seems to be: yes.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsMathematics, Computing, and Information Processing · Natural Language Processing Techniques · Topic Modeling

MethodsAttention Is All You Need · Absolute Position Encodings · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Multi-Head Attention · Adam