Solving Inequality Proofs with Large Language Models

Pan Lu; Jiayi Sheng; Luna Lyu; Jikai Jin; Tony Xia; Alex Gu; James Zou

arXiv:2506.07927 · cs.AI · December 16, 2025

TL;DR

This paper investigates how well large language models can prove inequalities. It introduces a new dataset, IneqMath, and an evaluation framework, which together reveal that current models struggle to produce rigorous proofs and point to areas for future research.

Contribution

The authors present IneqMath, an expert-curated dataset of Olympiad-level inequalities, and develop a novel LLM-as-judge evaluation framework that exposes significant gaps in models' proof reasoning abilities.

Findings

01

Top models achieve less than 10% overall accuracy when their proofs are checked for rigor.

02

Accuracy drops by up to 65.5% when reasoning steps are judged in addition to the final answer.

03

Scaling model size and computation yields only limited improvements.

Abstract

Inequality proving, crucial across diverse scientific and mathematical fields, tests advanced reasoning skills such as discovering tight bounds and strategic theorem application. This makes it a distinct, demanding frontier for large language models (LLMs), offering insights beyond general mathematical problem-solving. Progress in this area is hampered by existing datasets that are often scarce, synthetic, or rigidly formal. We address this by proposing an informal yet verifiable task formulation, recasting inequality proving into two automatically checkable subtasks: bound estimation and relation prediction. Building on this, we release IneqMath, an expert-curated dataset of Olympiad-level inequalities, including a test set and training corpus enriched with step-wise solutions and theorem annotations. We also develop a novel LLM-as-judge evaluation framework, combining a final-answer…
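To make the two subtasks concrete, here is a minimal sketch of what "automatically checkable" could look like for a final answer. The toy problems, the function names (`check_bound`, `check_relation`, `min_ratio`), and the tolerance are assumptions for illustration only; they are not taken from IneqMath or from the authors' evaluation code.

```python
import math
import random

# Hypothetical toy items (not from IneqMath):
#   Bound estimation: largest C with a^2 + b^2 >= C*a*b for all a, b > 0.
#                     By AM-GM the answer is C = 2.
#   Relation prediction: relation between a/b + b/a and 2 for all a, b > 0.
#                        The answer is ">=".


def check_bound(predicted: float, expected: float, tol: float = 1e-9) -> bool:
    """Final-answer check for a bound-estimation item (numeric equality)."""
    return math.isclose(predicted, expected, rel_tol=tol, abs_tol=tol)


def check_relation(predicted: str, expected: str) -> bool:
    """Final-answer check for a relation-prediction item (exact symbol match)."""
    return predicted.strip() == expected


def min_ratio(samples: int = 10_000, seed: int = 0) -> float:
    """Numerically probe the toy bound: smallest sampled (a^2 + b^2) / (a*b)."""
    rng = random.Random(seed)
    best = math.inf
    for _ in range(samples):
        a, b = rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)
        best = min(best, (a * a + b * b) / (a * b))
    return best


if __name__ == "__main__":
    print(check_bound(2.0, 2.0))         # model answered C = 2
    print(check_relation(" >= ", ">="))  # model answered ">="
    print(min_ratio())                   # sampled ratios stay at or above 2
```

A check like this only compares the final answer. It says nothing about whether the reasoning that led to the answer is sound, which is the gap the Findings above describe.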

Code & Models

Videos

Solving Inequality Proofs with Large Language Models · SlidesLive

Taxonomy

Topics: Machine Learning in Materials Science · Mathematics, Computing, and Information Processing · Topic Modeling

Methods: Sparse Evolutionary Training