Bolzano: Case Studies in LLM-Assisted Mathematical Research
Martin Balko, Jan Grebík, Pavel Hubáček, Martin Koutecký, Matěj Kripner, Václav Rozhoň, Robert Šámal, Adrián Zámečník

TL;DR
This paper reports new results on eight problems in mathematics and theoretical computer science, produced with the assistance of Bolzano, an open-source multi-agent LLM system, and classifies each result by significance and autonomy.
Contribution
It introduces Bolzano, an open-source multi-agent LLM system that orchestrates rounds of interaction between parallel prover agents and a verifier agent while maintaining a persistent knowledge base carried across rounds.
Findings
Six of eight problems reached publishable research level.
Five of eight results were produced essentially autonomously.
The results provide evidence that LLMs can contribute meaningfully to mathematical research.
Abstract
We report new results on eight problems in mathematics and theoretical computer science, produced with the assistance of Bolzano, an open-source multi-agent LLM system. Bolzano orchestrates rounds of interaction between parallel prover agents and a verifier agent while maintaining a persistent knowledge base that is carried across rounds. Classified using the significance-autonomy taxonomy of Feng et al., six of the eight results reach the level of publishable research, and five of the eight were produced essentially autonomously by Bolzano. Our results provide evidence that LLMs can contribute meaningfully to mathematical research, complementing recent reports by Bubeck et al., Woodruff et al., and others.
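The round structure described in the abstract (parallel provers, one verifier, a knowledge base carried across rounds) can be illustrated with a minimal Python sketch. This is not Bolzano's actual code or API: `KnowledgeBase`, `run_rounds`, and the stub provers and verifier are hypothetical names invented for illustration, and the LLM agents are replaced by deterministic functions on a toy problem.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class KnowledgeBase:
    """Hypothetical persistent store that survives from round to round."""
    notes: List[str] = field(default_factory=list)


# Assumed interfaces: a prover maps (problem, notes so far) to an attempt,
# and a verifier accepts or rejects an attempt.
Prover = Callable[[str, List[str]], str]
Verifier = Callable[[str, str], bool]


def run_rounds(
    problem: str,
    provers: List[Prover],
    verifier: Verifier,
    max_rounds: int = 3,
) -> Tuple[Optional[str], KnowledgeBase]:
    kb = KnowledgeBase()
    for round_no in range(1, max_rounds + 1):
        # Every prover in a round sees the same snapshot of the knowledge base.
        snapshot = list(kb.notes)
        with ThreadPoolExecutor(max_workers=len(provers)) as pool:
            attempts = list(pool.map(lambda p: p(problem, snapshot), provers))
        for attempt in attempts:
            if verifier(problem, attempt):
                return attempt, kb
            # Rejected attempts are recorded so later rounds can build on them.
            kb.notes.append(f"round {round_no}: rejected {attempt}")
    return None, kb


# Toy stand-ins for LLM agents: find x with x * x == 49.
def stub_prover_a(problem: str, notes: List[str]) -> str:
    return str(5 + len(notes))


def stub_prover_b(problem: str, notes: List[str]) -> str:
    return str(1 + len(notes))


def stub_verifier(problem: str, attempt: str) -> bool:
    return int(attempt) ** 2 == 49
```

Calling `run_rounds("x * x == 49", [stub_prover_a, stub_prover_b], stub_verifier)` rejects both first-round attempts, records them in the knowledge base, and accepts an attempt in the second round. In a real system the provers and verifier would be LLM calls, and the knowledge base would hold lemmas, failed approaches, and verifier feedback rather than short strings.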
