Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs
Siyuan Guo, Aniket Didolkar, Nan Rosemary Ke, Anirudh Goyal, Ferenc Huszár, Bernhard Schölkopf

TL;DR
This paper evaluates the mathematical understanding of large language models (LLMs) by analyzing their learning processes and knowledge structures, revealing insights into their domain comprehension and limitations.
Contribution
It introduces NTKEval, a novel assessment method inspired by Neural Tangent Kernel, to analyze how LLMs learn mathematical skills during training and in-context learning.
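As a rough illustration of what "assessing changes in an LLM's probability distribution" could look like, the sketch below measures how much probability mass moves onto the correct answer after training, grouped by the skill each test question requires. This is a minimal sketch under stated assumptions, not the paper's definition of NTKEval: the record layout and the function names `correct_prob_shift` and `mean_shift_by_skill` are hypothetical, and the probabilities are made-up toy values.

```python
from statistics import mean


def correct_prob_shift(before, after, correct):
    """Change in probability assigned to the correct option for one question.

    before / after: per-option probabilities from the model before and after
    training (hypothetical inputs; any real pipeline would have to supply them).
    """
    return after[correct] - before[correct]


def mean_shift_by_skill(records):
    """Average the per-question shift for each test skill."""
    grouped = {}
    for r in records:
        shift = correct_prob_shift(r["before"], r["after"], r["correct"])
        grouped.setdefault(r["skill"], []).append(shift)
    return {skill: mean(shifts) for skill, shifts in grouped.items()}


# Toy data: two answer options per question, illustrative values only.
records = [
    {"skill": "addition", "before": [0.25, 0.75], "after": [0.5, 0.5], "correct": 0},
    {"skill": "addition", "before": [0.5, 0.5], "after": [0.75, 0.25], "correct": 0},
    {"skill": "division", "before": [0.5, 0.5], "after": [0.5, 0.5], "correct": 1},
]
shifts = mean_shift_by_skill(records)
print(shifts)
```

Under this reading, a model that learns skill-specifically would show a larger shift on questions matching the skill it was trained on than on unrelated ones; comparing such per-skill shifts is one plausible way to tell targeted learning apart from a uniform change.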
Findings
Evidence of domain understanding during in-context learning
Instruction-tuning shows performance changes without domain understanding
In-context learning and instruction-tuning differ in how they acquire mathematical skills
Abstract
We are beginning to see progress in language model assisted scientific discovery. Motivated by the use of LLMs as a general scientific assistant, this paper assesses the domain knowledge of LLMs through its understanding of different mathematical skills required to solve problems. In particular, we look at not just what the pre-trained model already knows, but how it learned to learn from information during in-context learning or instruction-tuning through exploiting the complex knowledge structure within mathematics. Motivated by the Neural Tangent Kernel (NTK), we propose NTKEval to assess changes in LLM's probability distribution via training on different kinds of math data. Our systematic analysis finds evidence of domain understanding during in-context learning. By contrast, certain instruction-tuning leads to similar performance changes irrespective of training on…
Taxonomy
Topics: Open Education and E-Learning · Mathematics, Computing, and Information Processing
