Escaping the Cognitive Well: Efficient Competition Math with Off-the-Shelf Models
Xingyu Dang, Rohit Agarwal, Rodrigo Porto, Anirudh Goyal, Liam H Fowl, Sanjeev Arora

TL;DR
This paper introduces an efficient inference pipeline using off-the-shelf models that achieves state-of-the-art performance on IMO-style math problems at a fraction of the cost of previous methods.
Contribution
The work presents a novel pipeline that overcomes grader failure modes through conjecture extraction and context detachment, enabling high performance with general-purpose models.
Findings
Achieves 67.1% accuracy on IMO-ProofBench Advanced
Reduces inference cost to approximately 31 USD per question
Doubles success rate compared to previous public pipelines
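As a rough sanity check on the cost claim, the reported figure of about 31 USD per question can be compared with the 3000 USD per problem that the abstract gives as an example for earlier large-scale inference approaches. The snippet below is only back-of-the-envelope arithmetic on those two quoted numbers. The variable names are illustrative, and the 3000 USD value is an example from the abstract, not a measured baseline.

```python
import math

# Figures quoted in this summary. The 3000 USD value is the abstract's
# example of prior inference cost, not a benchmarked baseline.
PRIOR_COST_USD = 3000.0
PIPELINE_COST_USD = 31.0


def cost_ratio(prior: float, current: float) -> float:
    """Return how many times cheaper `current` is than `prior`."""
    return prior / current


ratio = cost_ratio(PRIOR_COST_USD, PIPELINE_COST_USD)
orders_of_magnitude = math.log10(ratio)

print(f"cost ratio: about {ratio:.0f}x")
print(f"orders of magnitude: about {orders_of_magnitude:.1f}")
```

Relative to that single example figure, the reduction is close to two orders of magnitude. Comparisons with other prior pipelines would need their own cost numbers.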
Abstract
In the past year, custom and unreleased math reasoning models reached gold medal performance on the International Mathematical Olympiad (IMO). Similar performance was then reported using large-scale inference on publicly available models but at prohibitive costs (e.g., 3000 USD per problem). In this work, we present an inference pipeline that attains best-in-class performance on IMO-style math problems at an average inference cost orders of magnitude below competing methods while using only general-purpose off-the-shelf models. Our method relies on insights about grader failure in solver-grader pipelines, which we call the Cognitive Well (iterative refinement converging to a wrong solution that the solver as well as the pipeline's internal grader consider to be basically correct). Our pipeline addresses these failure modes through conjecture extraction, wherein candidate lemmas are…
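To make the Cognitive Well concrete, the toy simulation below sketches a generic solver-grader refinement loop in which the grader shares the solver's blind spot. It is not the authors' pipeline: `Draft`, `toy_solver`, `toy_grader`, `refine`, and the 0 to 7 score are all hypothetical stand-ins chosen for illustration, and no language model is involved.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    polish: int             # toy proxy for how clean the write-up looks
    uses_false_lemma: bool  # hidden flaw that the toy grader cannot see


def toy_solver(draft: Draft) -> Draft:
    # Hypothetical refinement step: presentation improves, but the
    # underlying (possibly wrong) idea is carried over unchanged.
    return Draft(polish=draft.polish + 1,
                 uses_false_lemma=draft.uses_false_lemma)


def toy_grader(draft: Draft) -> int:
    # Hypothetical internal grader: scores 0..7 from polish alone,
    # so it never penalizes the false lemma.
    return min(7, draft.polish)


def refine(draft: Draft, accept_score: int = 7, max_rounds: int = 20):
    """Refine until the grader accepts or the round budget runs out."""
    rounds = 0
    while toy_grader(draft) < accept_score and rounds < max_rounds:
        draft = toy_solver(draft)
        rounds += 1
    return draft, rounds


final, rounds = refine(Draft(polish=2, uses_false_lemma=True))
accepted = toy_grader(final) >= 7
actually_correct = not final.uses_false_lemma

print(f"rounds={rounds} accepted={accepted} actually_correct={actually_correct}")
```

In this toy setting the loop ends with a draft the grader accepts even though the hidden flaw is still present, which is the kind of failure the abstract describes: more refinement rounds raise the internal score without fixing the error.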
Taxonomy
Topics: Topic Modeling · Constraint Satisfaction and Optimization · Cognitive and developmental aspects of mathematical skills
