A Benchmark for Math Misconceptions: Bridging Gaps in Middle School Algebra with AI-Supported Instruction
Otero Nancy, Druga Stefania, Lan Andrew

TL;DR
This paper presents a benchmark dataset of algebra misconceptions for AI educational tools, demonstrating its potential to diagnose student errors and support teacher training through AI integration.
Contribution
It introduces a new dataset of algebra misconceptions and evaluates AI models' effectiveness in identifying these misconceptions, highlighting the importance of human-AI collaboration.
Findings
LLMs reached up to 83.9% precision and recall in identifying misconceptions, when educator feedback was included and evaluation was restricted by topic
Topics like ratios and proportions are particularly challenging for AI
Most educators see value in AI tools for diagnosing misconceptions
Abstract
This study introduces an evaluation benchmark for middle school algebra to be used in artificial intelligence (AI)-based educational platforms. The goal is to support the design of AI systems that can enhance learners' conceptual understanding of algebra by taking into account their current level of algebra comprehension. The dataset comprises 55 misconceptions about algebra, common errors, and 220 diagnostic examples identified in previous peer-reviewed studies. We provide an example application using a large language model, observing precision and recall scores that vary by topic and experimental setup and reach 83.9% when educator feedback is included and the evaluation is restricted by topic. We found that topics such as ratios and proportions prove as difficult for LLMs as they are for students. We included a human assessment of LLM results and feedback from five middle school…
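The per-topic precision and recall figures mentioned in the abstract can be illustrated with a minimal Python sketch. The paper's exact scoring protocol is not given in this summary, so the following assumes each diagnostic example has one true misconception label and the model either names one label or abstains (`None`). The function name, topic names, and label IDs are hypothetical.

```python
from collections import defaultdict


def precision_recall_by_topic(examples):
    """Compute (precision, recall) per topic.

    examples: iterable of (topic, true_label, predicted_label) tuples.
    predicted_label is None when the model names no misconception.
    This is an assumed scoring scheme, not the paper's documented one.
    """
    correct = defaultdict(int)    # predictions that match the true label
    predicted = defaultdict(int)  # examples where the model gave a label
    actual = defaultdict(int)     # all examples (each has one true label)

    for topic, true_label, predicted_label in examples:
        actual[topic] += 1
        if predicted_label is not None:
            predicted[topic] += 1
            if predicted_label == true_label:
                correct[topic] += 1

    scores = {}
    for topic in actual:
        precision = correct[topic] / predicted[topic] if predicted[topic] else 0.0
        recall = correct[topic] / actual[topic]
        scores[topic] = (precision, recall)
    return scores


# Hypothetical toy data: (topic, true misconception ID, model's answer).
toy_examples = [
    ("ratios", "M1", "M1"),
    ("ratios", "M2", None),
    ("ratios", "M3", "M4"),
    ("equations", "M5", "M5"),
]
scores = precision_recall_by_topic(toy_examples)
```

Grouping by topic before scoring is what allows a comparison such as ratios and proportions against other topics; pooling all examples would hide that difference.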
Taxonomy
Topics: Online Learning and Analytics · Artificial Intelligence in Education · Teaching and Learning Programming
