Towards Autonomous Mathematics Research
Tony Feng, Trieu H. Trinh, Garrett Bingham, Dawsen Hwang, Yuri Chervonyi, Junehyuk Jung, Joonkyung Lee, Carlo Pagano, Sang-hyun Kim, Federico Pasqualotto, Sergei Gukov, Jonathan N. Lee, Junsu Kim, Kaiying Hou, Golnaz Ghiasi, Yi Tay, YaGuang Li, Chenkai Kuang, Yuan Liu

TL;DR
This paper introduces Aletheia, an AI agent for autonomous mathematics research that generates, verifies, and revises proofs in natural language, on problems ranging from Olympiad level to PhD-level exercises and open research questions.
Contribution
The work presents Aletheia, a math research agent that combines iterative generation, verification, and revision with intensive tool use. It extends beyond Olympiad-level problems to research-level mathematics, working both autonomously and in collaboration with human mathematicians.
Findings
Generated a research paper on eigenweights without human intervention.
Proved bounds on particle systems through human-AI collaboration.
Found autonomous solutions to four open problems in Bloom's Erdős Conjectures database.
Abstract
Recent advances in foundational models have yielded reasoning systems capable of achieving a gold-medal standard at the International Mathematical Olympiad. The transition from competition-level problem-solving to professional research, however, requires navigating vast literature and constructing long-horizon proofs. In this work, we introduce Aletheia, a math research agent that iteratively generates, verifies, and revises solutions end-to-end in natural language. Specifically, Aletheia is powered by an advanced version of Gemini Deep Think for challenging reasoning problems, a novel inference-time scaling law that extends beyond Olympiad-level problems, and intensive tool use to navigate the complexities of mathematical research. We demonstrate the capability of Aletheia from Olympiad problems to PhD-level exercises and most notably, through several distinct milestones in AI-assisted…
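The generate, verify, and revise cycle described in the abstract can be illustrated with a minimal sketch. This is not Aletheia's implementation: the names `solve`, `Verdict`, and `max_rounds`, and the toy arithmetic problem used as a stand-in for a proof task, are all assumptions made for illustration. The sketch only shows the control flow of an agent that returns a draft once a verifier accepts it and otherwise revises using the verifier's feedback.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of checking a draft (hypothetical structure)."""
    accepted: bool
    feedback: str = ""


def solve(
    problem: int,
    generate: Callable[[int], int],
    verify: Callable[[int, int], Verdict],
    revise: Callable[[int, int, str], int],
    max_rounds: int = 3,
) -> Optional[int]:
    """Return the first draft the verifier accepts, or None if none is accepted."""
    draft = generate(problem)
    for _ in range(max_rounds):
        verdict = verify(problem, draft)
        if verdict.accepted:
            return draft
        # Feed the verifier's critique back into the next attempt.
        draft = revise(problem, draft, verdict.feedback)
    return None  # Only verified drafts are ever returned.


# Toy stand-ins: the "problem" is n, the "solution" is the sum 1 + 2 + ... + n.
def toy_generate(n: int) -> int:
    return n * n // 2  # Deliberately flawed first attempt.


def toy_verify(n: int, draft: int) -> Verdict:
    expected = sum(range(n + 1))
    if draft == expected:
        return Verdict(accepted=True)
    return Verdict(accepted=False, feedback=f"off by {expected - draft}")


def toy_revise(n: int, draft: int, feedback: str) -> int:
    return n * (n + 1) // 2  # Corrected closed form.


result = solve(100, toy_generate, toy_verify, toy_revise)
```

In this sketch a draft that was revised in the last round but never checked is discarded rather than returned, so the loop fails closed when the round budget runs out.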
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Materials Science · Mathematics, Computing, and Information Processing
