MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Longhui Yu; Weisen Jiang; Han Shi; Jincheng Yu; Zhengying Liu; Yu Zhang; James T. Kwok; Zhenguo Li; Adrian Weller; Weiyang Liu

arXiv:2309.12284·cs.CL·May 6, 2024·28 cites

Open Access · 1 Repo · 10 Models · 5 Datasets

TL;DR

MetaMath introduces a fine-tuned LLaMA-2 model trained on a newly created dataset, MetaMathQA, which significantly improves mathematical reasoning performance on benchmarks like GSM8K and MATH.

Contribution

The paper presents MetaMath, a novel approach that bootstraps mathematical questions from multiple perspectives to enhance LLMs' reasoning abilities, outperforming existing open-source models.

Findings

1. MetaMath-7B achieves 66.4% on GSM8K and 19.4% on MATH.
2. MetaMath-70B achieves 82.3% accuracy on GSM8K, surpassing GPT-3.5-Turbo.
3. The MetaMathQA dataset and models are publicly released.

Abstract

Large language models (LLMs) have pushed the limits of natural language understanding and exhibited excellent problem-solving ability. Despite this success, most existing open-source LLMs (e.g., LLaMA-2) are still far from satisfactory at solving mathematical problems because of the complex reasoning procedures involved. To bridge this gap, we propose MetaMath, a fine-tuned language model that specializes in mathematical reasoning. Specifically, we first bootstrap mathematical questions by rewriting each question from multiple perspectives without extra knowledge, which results in a new dataset called MetaMathQA. We then fine-tune the LLaMA-2 models on MetaMathQA. Experimental results on two popular benchmarks for mathematical reasoning (i.e., GSM8K and MATH) demonstrate that MetaMath outperforms a suite of open-source LLMs by a significant margin. Our MetaMath-7B model achieves…

Code & Models

Repositories

meta-math/MetaMath (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Weight Decay · Multi-Head Attention · Adam · Residual Connection · Attention Dropout