Large Language Models' Understanding of Math: Source Criticism and Extrapolation

Roozbeh Yousefzadeh; Xuenan Cao

arXiv:2311.07618·cs.LG·November 15, 2023·1 cites

Open Access

TL;DR

This paper critically examines GPT-4's mathematical understanding, finding it primarily reproduces seen proofs rather than truly grasping concepts, and questions the value of its current approach for theorem proving.

Contribution

The study provides a critical evaluation of GPT-4's mathematical capabilities, highlighting its limitations in understanding and reasoning beyond reproducing seen proofs.

Findings

01. GPT-4 struggles with problems lacking known formal proofs.

02. GPT-4's theorem proving ability appears to expand over time.

03. Reproducing seen proofs is its main strength, not understanding.

Abstract

It has been suggested that large language models such as GPT-4 have acquired some form of understanding beyond the correlations among the words in text, including some understanding of mathematics. Here, we perform a critical inquiry into this claim by evaluating the mathematical understanding of the GPT-4 model. Because GPT-4's training set is secret, it is not straightforward to evaluate whether the model's correct answers are based on mathematical understanding or on replication of proofs that the model has seen before. We specifically craft mathematical questions whose formal proofs are not readily available on the web, proofs that GPT-4 is less likely to have seen. We see that GPT-4 is unable to solve those problems despite their simplicity. It is hard to find scientific evidence suggesting that GPT-4 has acquired an understanding of even basic…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Mathematics, Computing, and Information Processing

Methods: Sparse Evolutionary Training · Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Byte Pair Encoding · Dropout · Adam · Softmax · Label Smoothing