Learning Mathematical Rules with Large Language Models

Antoine Gorceix; Bastien Le Chenadec; Ahmad Rammal; Nelson Vadori; Manuela Veloso

arXiv:2410.16973·cs.CL·October 28, 2024

Open Access

TL;DR

This paper investigates how well large language models can learn, generalize, and reuse mathematical rules like distributivity and simplification, using synthetic data and fine-tuning, with promising but limited success.

Contribution

It introduces a rigorous methodology for training large language models on synthetic mathematical rule data and evaluates their ability to generalize and apply these rules.

Findings

01 Models can learn and generalize mathematical rules to some extent.

02 Models can reuse learned rules in solving word problems.

03 Fine-tuning improves models' mathematical reasoning capabilities.

Abstract

In this paper, we study the ability of large language models to learn specific mathematical rules such as distributivity or simplifying equations. We present an empirical analysis of their ability to generalize these rules, as well as to reuse them in the context of word problems. For this purpose, we provide a rigorous methodology to build synthetic data incorporating such rules, and perform fine-tuning of large language models on such data. Our experiments show that our model can learn and generalize these rules to some extent, as well as suitably reuse them in the context of word problems.
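
To make the idea of synthetic rule data concrete, here is a minimal sketch of how prompt/target pairs for the distributivity rule might be generated. It is not the authors' pipeline: the function names, the "Expand:" prompt format, and the symbol pool are all hypothetical choices made for illustration.

```python
import random

# Hypothetical symbol pool; the paper's actual vocabulary is not given here.
SYMBOLS = list("abcdefghxyz")

def make_distributivity_example(rng):
    """Return a (prompt, target) pair illustrating a * (b + c) = a * b + a * c."""
    # Draw three distinct symbols so the expansion is unambiguous.
    a, b, c = rng.sample(SYMBOLS, 3)
    prompt = f"Expand: {a} * ({b} + {c})"
    target = f"{a} * {b} + {a} * {c}"
    return prompt, target

def build_dataset(n, seed=0):
    """Build n examples with a seeded generator so the data is reproducible."""
    rng = random.Random(seed)
    return [make_distributivity_example(rng) for _ in range(n)]

if __name__ == "__main__":
    for prompt, target in build_dataset(5):
        print(prompt, "->", target)
```

Pairs of this shape could serve as fine-tuning data. Testing generalization would then mean evaluating on symbols or expression forms held out of the training set, which this sketch does not attempt.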

Taxonomy

Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing · Handwritten Text Recognition Techniques