Learning Mathematical Rules with Large Language Models
Antoine Gorceix, Bastien Le Chenadec, Ahmad Rammal, Nelson Vadori, Manuela Veloso

TL;DR
This paper investigates how well large language models can learn, generalize, and reuse mathematical rules like distributivity and simplification, using synthetic data and fine-tuning, with promising but limited success.
Contribution
It introduces a rigorous methodology for training large language models on synthetic mathematical rule data and evaluates their ability to generalize and apply these rules.
Findings
Models can learn and generalize mathematical rules to some extent.
Models can reuse learned rules in solving word problems.
Fine-tuning improves models' mathematical reasoning capabilities.
Abstract
In this paper, we study the ability of large language models to learn specific mathematical rules such as distributivity or simplifying equations. We present an empirical analysis of their ability to generalize these rules, as well as to reuse them in the context of word problems. For this purpose, we provide a rigorous methodology to build synthetic data incorporating such rules, and perform fine-tuning of large language models on such data. Our experiments show that our model can learn and generalize these rules to some extent, as well as suitably reuse them in the context of word problems.
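As an illustration of what "synthetic data incorporating such rules" could look like, here is a minimal sketch that generates prompt/target pairs for the distributivity rule k * (u + v) = k * u + k * v. This is not the authors' pipeline: the function names, the prompt wording ("Expand: ..."), and the variable pool are all assumptions made for the example.

```python
import random


def make_distributivity_example(rng, variables=("a", "b", "c", "x", "y")):
    """Return a (prompt, target) pair for k * (u + v) = k * u + k * v.

    Hypothetical format: the summary above does not specify how the
    paper phrases its prompts or targets.
    """
    # Draw three distinct symbols so the factor and both terms differ.
    k, u, v = rng.sample(variables, 3)
    prompt = f"Expand: {k} * ({u} + {v})"
    target = f"{k} * {u} + {k} * {v}"
    return prompt, target


def build_dataset(n, seed=0):
    """Build n examples reproducibly from a seeded generator."""
    rng = random.Random(seed)
    return [make_distributivity_example(rng) for _ in range(n)]


if __name__ == "__main__":
    for prompt, target in build_dataset(3):
        print(prompt, "->", target)
```

Generalization could then be probed by holding out some variable names or expression shapes from the training split and evaluating on them, though the exact evaluation protocol is described in the paper itself rather than in this summary.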
Taxonomy
Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing · Handwritten Text Recognition Techniques
