Autograding Mathematical Induction Proofs with Natural Language Processing

Chenyan Zhao, Mariana Silva, and Seth Poulsen

arXiv:2406.10268·cs.AI·July 15, 2025·2 cites

Open Access

TL;DR

This paper trains and evaluates large language models for automatically grading freeform mathematical induction proofs. The models reach satisfactory grading accuracy, the autograder is more accurate than most human graders, and students improve their proofs using its feedback.

Contribution

The paper introduces novel training methods for large language models to autograde mathematical proofs, and shows their effectiveness compared to human graders.

Findings

01. Models achieve satisfactory grading accuracy.

02. AI autograder outperforms most human graders.

03. Students improve their proofs with AI feedback.

Abstract

In mathematical proof education, there remains a need for interventions that help students learn to write mathematical proofs. Research has shown that timely feedback can be very helpful to students learning new skills. While for many years natural language processing models have struggled to perform well on tasks related to mathematical texts, recent developments in natural language processing have created the opportunity to complete the task of giving students instant feedback on their mathematical proofs. In this paper, we present a set of training methods and models capable of autograding freeform mathematical proofs by leveraging existing large language models and other machine learning techniques. The models are trained using proof data collected from four different proof by induction problems. We use four different robust large language models to compare their performances, and…

Taxonomy

Topics: Mathematics, Computing, and Information Processing

Methods: Sparse Evolutionary Training