Why Agentic Theorem Prover Works: A Statistical Provability Theory of Mathematical Reasoning Models
Sho Sonoda, Shunta Akiyama, Yuya Uezato

TL;DR
This paper introduces a statistical provability framework for agentic theorem provers, explaining their empirical success through a formal, component-sensitive theory based on Markov decision processes and Bellman structure.
Contribution
It formalizes modern theorem-proving pipelines as time-bounded MDPs and derives bounds on their performance, providing a theoretical explanation for their success and limitations.
Findings
Proves existence of optimal policies under mild regularity.
Derives provability certificates via sub-/super-solution inequalities.
Bounds performance gap based on complexity and geometry.
Abstract
Agentic theorem provers -- pipelines that couple a mathematical reasoning model with library retrieval, subgoal-decomposition/search planner, and a proof assistant verifier -- have recently achieved striking empirical success, yet it remains unclear which components drive performance and why such systems work at all despite classical hardness of proof search. We propose a distributional viewpoint and introduce \textbf{statistical provability}, defined as the finite-horizon success probability of reaching a verified proof, averaged over an instance distribution, and formalize modern theorem-proving pipelines as time-bounded MDPs. Exploiting Bellman structure, we prove existence of optimal policies under mild regularity, derive provability certificates via sub-/super-solution inequalities, and bound the performance gap of score-guided planning (greedy/top-\(k\)/beam/rollouts) in terms of…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, Reasoning, and Knowledge · Constraint Satisfaction and Optimization · Computability, Logic, AI Algorithms
