MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Zhengyang Tang; Xingxing Zhang; Benyou Wang; Furu Wei

arXiv:2403.02884 · cs.CL · March 6, 2024 · 2 cites

TL;DR

MathScale introduces a scalable method to generate extensive mathematical reasoning datasets using frontier LLMs, significantly enhancing the mathematical problem-solving abilities of open-source models like LLaMA-2 and Mistral.

Contribution

We propose MathScale, a novel approach to creating large-scale math reasoning data, and demonstrate its effectiveness in improving LLMs' mathematical capabilities.

Findings

01. Created MathScaleQA, a dataset of two million math question-answer pairs.

02. Achieved state-of-the-art performance on MwpBench with MathScale-7B.

03. MathScale-7B shows significant accuracy improvements over peer models of similar size.

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in problem-solving. However, their proficiency in solving mathematical problems remains inadequate. We propose MathScale, a simple and scalable method to create high-quality mathematical reasoning data using frontier LLMs (e.g., GPT-3.5). Inspired by the cognitive mechanism in human mathematical learning, it first extracts topics and knowledge points from seed math questions and then builds a concept graph, which is subsequently used to generate new math questions. MathScale exhibits effective scalability along the size axis of the math dataset that we generate. As a result, we create a mathematical reasoning dataset (MathScaleQA) containing two million math question-answer pairs. To evaluate mathematical reasoning abilities of LLMs comprehensively, we construct MwpBench, a benchmark of Math Word Problems,…
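
The pipeline described in the abstract (extract topics and knowledge points, build a concept graph, generate new questions) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the extraction step is replaced by hard-coded hypothetical data (`SEED_EXTRACTIONS`), graph edges are weighted by how often two concepts co-occur in the same seed question, concepts are sampled with a short weighted walk, and `make_prompt` only builds a prompt string rather than calling an LLM.

```python
import random
from collections import defaultdict
from itertools import combinations

# Hypothetical result of the extraction step. In the method described above a
# frontier LLM would produce this from seed questions; here it is hard-coded.
SEED_EXTRACTIONS = [
    {"topics": ["algebra"], "knowledge_points": ["linear equations", "substitution"]},
    {"topics": ["algebra"], "knowledge_points": ["linear equations", "word problems"]},
    {"topics": ["geometry"], "knowledge_points": ["area of a rectangle", "word problems"]},
]


def build_concept_graph(extractions):
    """Build an undirected graph whose edge weights count concept co-occurrences."""
    graph = defaultdict(lambda: defaultdict(int))
    for item in extractions:
        concepts = sorted(set(item["topics"]) | set(item["knowledge_points"]))
        for a, b in combinations(concepts, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def sample_concepts(graph, start, steps, rng):
    """Walk the graph from `start`, preferring heavier edges, without revisiting nodes."""
    path = [start]
    current = start
    for _ in range(steps):
        neighbors = graph.get(current, {})
        unvisited = [n for n in neighbors if n not in path]
        if not unvisited:
            break  # dead end: every neighbor is already on the path
        weights = [neighbors[n] for n in unvisited]
        current = rng.choices(unvisited, weights=weights, k=1)[0]
        path.append(current)
    return path


def make_prompt(concepts):
    """Turn sampled concepts into a generation prompt (illustrative wording only)."""
    return (
        "Write a new math word problem with a step-by-step solution "
        "that involves: " + ", ".join(concepts)
    )


if __name__ == "__main__":
    graph = build_concept_graph(SEED_EXTRACTIONS)
    concepts = sample_concepts(graph, "algebra", steps=2, rng=random.Random(0))
    print(make_prompt(concepts))
```

Under these assumptions, concepts that co-occur more often in the seed questions are more likely to be combined into one generated question, and each new starting concept or random seed yields a different prompt.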

Code & Models

Repositories

microsoft/unilm/tree/master/mathscale
PyTorch · Official

Taxonomy

Topics: Mathematics Education and Teaching Techniques · Intelligent Tutoring Systems and Adaptive Learning

Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Layer Normalization · Byte Pair Encoding · Dropout · Multi-Head Attention · Linear Warmup With Cosine Annealing