Semi-Autonomous Mathematics Discovery with Gemini: A Case Study on the Erdős Problems
Tony Feng, Trieu Trinh, Garrett Bingham, Jiwon Kang, Shengtong Zhang, Sang-hyun Kim, Kevin Barreto, Carl Schildkraut, Junehyuk Jung, Jaehyeon Seo, Carlo Pagano, Yuri Chervonyi, Dawsen Hwang, Kaiying Hou, Sergei Gukov, Cheng-Chiang Tsai, Hyunwoo Choi, Youngbeom Jin, Wei-Yuan Li

TL;DR
This paper presents a case study in semi-autonomous mathematical discovery, combining AI-driven verification with human expert review to evaluate 700 conjectures labeled 'Open' in the Erdős Problems database. It addresses 13 of them and suggests their 'Open' status reflected obscurity rather than difficulty.
Contribution
It introduces a hybrid AI-human methodology for evaluating open conjectures and provides a case study on Erdős Problems, highlighting AI's potential and challenges in mathematical discovery.
Findings
5 conjectures solved with seemingly novel autonomous methods
8 conjectures identified as previously solved in literature
The 'Open' status of the addressed problems appears to reflect obscurity rather than difficulty
Abstract
We present a case study in semi-autonomous mathematics discovery, using Gemini to systematically evaluate 700 conjectures labeled 'Open' in Bloom's Erdős Problems database. We employ a hybrid methodology: AI-driven natural language verification to narrow the search space, followed by human expert evaluation to gauge correctness and novelty. We address 13 problems that were marked 'Open' in the database: 5 through seemingly novel autonomous solutions, and 8 through identification of previous solutions in the existing literature. Our findings suggest that the 'Open' status of the problems was through obscurity rather than difficulty. We also identify and discuss issues arising in applying AI to math conjectures at scale, highlighting the difficulty of literature identification and the risk of "subconscious plagiarism" by AI. We reflect on the takeaways from AI-assisted efforts on…
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Benford’s Law and Fraud Detection · Machine Learning and Algorithms
