Arrows of Math Reasoning Data Synthesis for Large Language Models: Diversity, Complexity and Correctness

Sirui Chen; Changxin Tian; Binbin Hu; Kunlong Chen; Ziqi Liu; Zhiqiang Zhang; Jun Zhou

arXiv:2508.18824·cs.CL·August 27, 2025

TL;DR

This paper introduces a program-assisted data synthesis framework for large language models that produces diverse, complex, and correct mathematical reasoning data, significantly enhancing model performance on benchmarks.

Contribution

The paper presents a novel, scalable synthesis method integrating mathematical tools and validation mechanisms to generate high-quality mathematical reasoning data for LLMs.

Findings

01

Generated 12.3 million problem-solving triples (program, problem, solution).

02

Models fine-tuned on this data achieve state-of-the-art results.

03

Framework ensures diversity, complexity, and correctness of data.

Abstract

Enhancing the mathematical reasoning of large language models (LLMs) demands high-quality training data, yet conventional methods face critical challenges in scalability, cost, and data reliability. To address these limitations, we propose a novel program-assisted synthesis framework that systematically generates a high-quality mathematical corpus with guaranteed diversity, complexity, and correctness. This framework integrates mathematical knowledge systems and domain-specific tools to create executable programs. These programs are then translated into natural language problem-solution pairs and vetted by a bilateral validation mechanism that verifies solution correctness against program outputs and ensures program-problem consistency. We have generated 12.3 million such problem-solving triples. Experiments demonstrate that models fine-tuned on our data significantly improve their…
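To make the bilateral validation idea concrete, here is a minimal sketch of the two checks the abstract names: the solution is compared against the program's output, and the problem is checked for consistency with the program. This is not the authors' implementation. The `Triple` class, the convention that a program binds its result to `answer`, the "Answer: <value>" solution format, and the literal-matching consistency check are all assumptions made for illustration; the consistency check in particular is only a crude proxy for whatever the paper actually uses.

```python
import ast
import re
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Triple:
    program: str   # executable source that binds its result to `answer` (assumed convention)
    problem: str   # natural-language problem statement
    solution: str  # natural-language solution ending in "Answer: <value>" (assumed convention)


def run_program(source: str) -> Fraction:
    """Execute the program and return the value bound to `answer`."""
    namespace: dict = {}
    exec(source, namespace)  # illustrative only; untrusted code belongs in a sandbox
    return Fraction(namespace["answer"])


def final_answer(solution: str) -> Fraction:
    """Extract the final integer or fraction stated in the solution text."""
    match = re.search(r"Answer:\s*(-?\d+(?:/\d+)?)", solution)
    if match is None:
        raise ValueError("solution does not state a final answer")
    return Fraction(match.group(1))


def solution_matches_program(triple: Triple) -> bool:
    """Check 1: the solution's stated answer equals the program's output."""
    return final_answer(triple.solution) == run_program(triple.program)


def problem_matches_program(triple: Triple) -> bool:
    """Check 2 (crude proxy): every integer literal in the program appears in the problem."""
    literals = {
        node.value
        for node in ast.walk(ast.parse(triple.program))
        if isinstance(node, ast.Constant)
        and isinstance(node.value, int)
        and not isinstance(node.value, bool)
    }
    mentioned = {int(token) for token in re.findall(r"\d+", triple.problem)}
    return literals <= mentioned


def validate(triple: Triple) -> bool:
    """Keep a triple only if both directions of the validation pass."""
    return solution_matches_program(triple) and problem_matches_program(triple)


example = Triple(
    program="price = 12\ncount = 5\nanswer = price * count",
    problem="A notebook costs 12 dollars. How much do 5 notebooks cost?",
    solution="Five notebooks cost 12 * 5 = 60 dollars. Answer: 60",
)
print(validate(example))
```

The example triple passes both checks. Changing the stated answer in the solution, or dropping one of the quantities from the problem text, causes `validate` to reject it, which is the filtering behavior a two-sided check is meant to provide.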
