Evaluating the Reasoning Abilities of LLMs on Underrepresented Mathematics Competition Problems

Samuel Golladay; Majid Bani-Yaghoub

arXiv:2512.24505·cs.AI·January 1, 2026



Open Access

TL;DR

This study assesses the reasoning abilities of three leading LLMs on underrepresented mathematics competition problems, revealing strengths in calculus and weaknesses in geometry, with distinct error patterns across models.

Contribution

It introduces an evaluation of LLM performance on underrepresented math problems, highlighting specific error types and model differences in reasoning and accuracy.

Findings

01. DeepSeek-V3 outperforms other models across categories.

02. All models show weak performance in Geometry.

03. Different models exhibit distinct error patterns.

Abstract

Understanding the limitations of Large Language Models (LLMs) in mathematical reasoning has been the focus of several recent studies. However, the majority of these studies use the same datasets for benchmarking, which limits the generalizability of their findings and may not fully capture the diverse challenges present in mathematical tasks. The purpose of the present study is to analyze the performance of LLMs on underrepresented mathematics competition problems. We prompted three leading LLMs, namely GPT-4o-mini, Gemini-2.0-Flash, and DeepSeek-V3, with the Missouri Collegiate Mathematics Competition problems in the areas of Calculus, Analytic Geometry, and Discrete Mathematics. The LLMs' responses were then compared to the known correct solutions in order to determine the accuracy of each LLM for each problem domain. We also analyzed the LLMs' reasoning to explore patterns in errors…
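The per-domain accuracy comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes each response has already been graded against the known solution, and the function name `accuracy_by_domain`, the record layout, and the placeholder model names are all hypothetical.

```python
from collections import defaultdict


def accuracy_by_domain(records):
    """Return {(model, domain): fraction correct}.

    `records` is an iterable of (model, domain, is_correct) tuples,
    where is_correct is the result of grading one response against
    the known solution. This layout is an assumption for illustration.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for model, domain, is_correct in records:
        key = (model, domain)
        totals[key] += 1
        correct[key] += int(is_correct)
    return {key: correct[key] / totals[key] for key in totals}


# Placeholder grades for illustration only; NOT results from the paper.
toy_records = [
    ("model_a", "Calculus", True),
    ("model_a", "Calculus", False),
    ("model_a", "Analytic Geometry", False),
    ("model_b", "Calculus", True),
]

scores = accuracy_by_domain(toy_records)
for (model, domain), acc in sorted(scores.items()):
    print(f"{model:8s} {domain:18s} {acc:.2f}")
```

Keying the result on the (model, domain) pair keeps the three problem areas separate, so a model that is strong in one area and weak in another is not averaged into a single score.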


Taxonomy

Topics: Mathematics, Computing, and Information Processing · Mathematics Education and Teaching Techniques · Intelligent Tutoring Systems and Adaptive Learning