First Proof
Mohammed Abouzaid, Andrew J. Blumberg, Martin Hairer, Joe Kileel, Tamara G. Kolda, Paul D. Nelson, Daniel Spielman, Nikhil Srivastava, Rachel Ward, Shmuel Weinberger, and Lauren Williams

TL;DR
This paper shares ten research-level mathematics questions, drawn from the authors' own research, to test whether current AI systems can answer them correctly. The questions were previously unpublished, and their answers remain encrypted for a short time.
Contribution
It presents a novel, publicly shared set of research-level math questions that serve as a benchmark for testing AI systems' mathematical reasoning.
Findings
Questions are challenging for current AI systems
Provides a new benchmark for AI mathematical reasoning
Encourages development of more advanced AI in mathematics
Abstract
To assess the ability of current AI systems to correctly answer research-level mathematics questions, we share a set of ten math questions which have arisen naturally in the research process of the authors. The questions had not been shared publicly until now; the answers are known to the authors of the questions but will remain encrypted for a short time.
Taxonomy
Topics: Mathematics, Computing, and Information Processing · History and Theory of Mathematics · Computability, Logic, AI Algorithms
