TL;DR
STAR-PólyaMath is a multi-agent reasoning framework that uses meta-level supervision and structured interaction to improve mathematical problem-solving over extended reasoning chains, achieving state-of-the-art results.
Contribution
It introduces a persistent Meta-Strategist and an orchestrated state machine to address reliability issues in long-horizon reasoning tasks, surpassing prior results on existing benchmarks.
Findings
Achieves perfect scores on AIMEs, Putnam, and HMMT.
Reaches 93.75% accuracy on Apex 2025, outperforming the GPT-5.5 baseline.
Demonstrates that framework orchestration improves performance independently of model diversity.
Abstract
Frontier AI models and multi-agent systems have led to significant improvements in mathematical reasoning. However, for problems requiring extended, long-horizon reasoning, existing systems continue to suffer from fundamental reliability issues: hallucination accumulation, memory fragmentation, and imbalanced reasoning-tool trade-offs. In this paper, we introduce STAR-PólyaMath, a multi-agent framework that systematically addresses these challenges through meta-level supervision and structured Reasoner-Verifier interaction. STAR-PólyaMath is structured as an orchestrated state machine with nested challenge-step-replan loops, governed by a reasoning-free Python orchestrator that separates control from inference and bounds error propagation through trace-back and re-planning. Our key innovation is a persistent Meta-Strategist that maintains cross-attempt memory and exercises…
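The nested challenge-step-replan control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: every name here (`Memory`, `orchestrate`, `plan`, `reason`, `verify`, the loop bounds) is hypothetical, the three agents are replaced by plain callables, and trace-back is simplified to discarding the partial trace before re-planning. The only point it tries to show is that the orchestrator itself does no reasoning; it only routes between agents and bounds how far an error can propagate.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Memory:
    """Cross-attempt notes, standing in for the Meta-Strategist's memory (hypothetical)."""
    failed_plans: List[str] = field(default_factory=list)


def orchestrate(
    problem: str,
    plan: Callable[[str, Memory], List[str]],  # stand-in for the Meta-Strategist
    reason: Callable[[str, List[str]], str],   # stand-in for the Reasoner
    verify: Callable[[str, str], bool],        # stand-in for the Verifier
    max_replans: int = 3,                      # assumed bound, not from the paper
    max_challenges: int = 2,                   # assumed bound, not from the paper
) -> Optional[List[str]]:
    """Reasoning-free control loop: returns the verified steps, or None on failure."""
    memory = Memory()
    for _ in range(max_replans):               # outer loop: re-plan
        steps = plan(problem, memory)
        accepted: List[str] = []
        failed = False
        for step in steps:                     # middle loop: one plan step at a time
            for _ in range(max_challenges):    # inner loop: Verifier challenges the Reasoner
                candidate = reason(step, accepted)
                if verify(step, candidate):
                    accepted.append(candidate)
                    break
            else:
                # No candidate survived verification: record the failed plan,
                # drop the partial trace, and go back to the planner.
                memory.failed_plans.append(" | ".join(steps))
                failed = True
                break
        if not failed:
            return accepted
    return None


# Toy usage with deterministic stubs in place of model calls.
def demo_plan(problem: str, memory: Memory) -> List[str]:
    # The first plan is one the verifier rejects; later plans use the memory to switch.
    return ["guess"] if not memory.failed_plans else ["expand", "simplify"]


def demo_reason(step: str, accepted: List[str]) -> str:
    return f"{step}: done"


def demo_verify(step: str, candidate: str) -> bool:
    return step != "guess"


result = orchestrate("toy problem", demo_plan, demo_reason, demo_verify)
print(result)  # ['expand: done', 'simplify: done']
```

In this toy run the first plan is rejected after two challenges, the failure is written to memory, and the second plan succeeds. Keeping the loop bounds in the orchestrator rather than in any agent is one way to read the abstract's claim that control is separated from inference.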
