DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
Zhihong Shao, Yuxiang Luo, Chengda Lu, Z.Z. Ren, Jiewen Hu, Tian Ye, Zhibin Gou, Shirong Ma, Xiaokang Zhang

TL;DR
DeepSeekMath-V2 pairs a proof generator with a trained verifier so that the model can check its own step-by-step proofs. It reaches gold-level scores on IMO 2025 and CMO 2024, and 118/120 on Putnam 2024 with scaled test-time compute.
Contribution
The paper introduces a self-verifiable reasoning framework, built from a trained verifier and a proof generator, that aims to improve accuracy and rigor in mathematical theorem proving.
Findings
Achieved gold-level scores on IMO 2025 and CMO 2024.
Scored 118/120 on Putnam 2024 with scaled test-time compute.
Demonstrated the effectiveness of self-verification in mathematical reasoning.
Abstract
Large language models have made significant progress in mathematical reasoning, which serves as an important testbed for AI and could impact scientific research if further advanced. By scaling reasoning with reinforcement learning that rewards correct final answers, LLMs have improved from poor performance to saturating quantitative reasoning competitions like AIME and HMMT in one year. However, this approach faces fundamental limitations. Pursuing higher final answer accuracy doesn't address a key issue: correct answers don't guarantee correct reasoning. Moreover, many mathematical tasks like theorem proving require rigorous step-by-step derivation rather than numerical answers, making final answer rewards inapplicable. To push the limits of deep reasoning, we believe it is necessary to verify the comprehensiveness and rigor of mathematical reasoning. Self-verification is particularly…
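The abstract's point that correct answers do not guarantee correct reasoning can be illustrated with a toy sketch. Everything below is hypothetical and is not the paper's verifier: a "solution" is assumed to be a list of (actual value, claimed value) pairs, and the function names `final_answer_reward` and `stepwise_reward` are invented for illustration only.

```python
# Toy illustration (not the paper's method): a final-answer reward cannot
# tell a sound derivation from one whose errors happen to cancel out.

def final_answer_reward(steps, reference):
    """Return 1.0 if the last claimed value equals the reference answer."""
    return 1.0 if steps[-1][1] == reference else 0.0


def stepwise_reward(steps, reference):
    """Return 1.0 only if every step is valid AND the final answer matches."""
    all_valid = all(actual == claimed for actual, claimed in steps)
    return 1.0 if all_valid and steps[-1][1] == reference else 0.0


# Problem: compute (3 + 5) * 2. Each step is (actual value, claimed value).
sound = [(3 + 5, 8), (8 * 2, 16)]
# Flawed: claims 3 + 5 = 9, then claims 9 * 2 = 16. Two errors cancel.
flawed = [(3 + 5, 9), (9 * 2, 16)]

print(final_answer_reward(sound, 16), final_answer_reward(flawed, 16))
print(stepwise_reward(sound, 16), stepwise_reward(flawed, 16))
```

In this sketch the final-answer reward scores both solutions identically, while the step-level check rejects the flawed one. Real proofs have no such mechanical per-step check, which is the gap a learned verifier is meant to fill.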
Taxonomy
Topics: Machine Learning in Materials Science · Topic Modeling · Mathematics, Computing, and Information Processing
