Experimental results from applying GPT-4 to an unpublished formal language
Gregor vom Scheidt

TL;DR
This paper demonstrates that GPT-4 can effectively understand, generate, and verify formal language tasks, including theorem proving and syntax invention, showcasing its potential in mathematical and formal reasoning domains.
Contribution
It provides the first evidence that GPT-4 can handle complex formal language tasks, including theorem proving and syntax creation, in an unpublished formal system.
Findings
GPT-4 completed all tasks successfully
Exhibited extensive domain knowledge and inference abilities
Invented new syntax and semantics
Abstract
Can large language models be used to complete mathematical tasks that are traditionally performed either manually or with the aid of theorem provers? To answer this question, a state-of-the-art system, GPT-4, was provided with a concise natural language specification for a previously unpublished formal system and asked to complete a number of tasks, from stating function and type definitions to proving simple theorems and verifying user-supplied proofs. The system completed all tasks successfully, showed extensive domain knowledge, invented helpful new syntax and semantics, and exhibited generalization and inference abilities. So the answer seems to be: yes.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMathematics, Computing, and Information Processing · Natural Language Processing Techniques · Topic Modeling
MethodsAttention Is All You Need · Absolute Position Encodings · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Multi-Head Attention · Adam
