MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning

Chengpeng Li; Zheng Yuan; Hongyi Yuan; Guanting Dong; Keming Lu; Jiancan Wu; Chuanqi Tan; Xiang Wang; Chang Zhou

arXiv:2310.05506·cs.CL·July 18, 2024·1 cites

TL;DR

This paper investigates how query and response augmentation strategies improve math reasoning in large language models, demonstrating state-of-the-art performance and analyzing data scaling and generalization limitations.

Contribution

It introduces two augmented datasets, AugGSM8K and AugMATH, and fine-tunes LLaMA models on them to improve math reasoning, providing insights into the effects of data augmentation and the limits of generalization.

Findings

01

Augmentation significantly improves performance on GSM8K and MATH.

02

A log-linear relationship exists between data amount and performance.

03

Limited out-of-domain generalization indicates need for broader subject coverage.
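The log-linear relationship in finding 02 can be illustrated with a small fitting sketch. It assumes the form accuracy = a + b · log(n), where n is the number of training examples; the function names and the data points are hypothetical and synthetic, not results from the paper.

```python
import math

def fit_log_linear(sizes, accuracies):
    """Ordinary least-squares fit of accuracy = a + b * log(size)."""
    xs = [math.log(n) for n in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(accuracies) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
    b = sxy / sxx               # slope: accuracy gain per unit of log(size)
    a = mean_y - b * mean_x     # intercept
    return a, b

def predict(a, b, size):
    """Predicted accuracy at a given dataset size under the fitted line."""
    return a + b * math.log(size)

# Synthetic points generated from a = 10, b = 5 (made up for illustration,
# NOT numbers reported in the paper).
sizes = [1000, 2000, 4000, 8000]
accs = [10 + 5 * math.log(n) for n in sizes]

a, b = fit_log_linear(sizes, accs)
gain_per_doubling = predict(a, b, 2000) - predict(a, b, 1000)
```

Under this form, each doubling of the data adds a constant b · log(2) to the predicted accuracy, which is why gains appear to shrink when plotted against raw dataset size.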

Abstract

In math reasoning with large language models (LLMs), augmenting fine-tuning data through query evolution and diverse reasoning paths has been empirically shown to be effective, substantially narrowing the gap between open-source LLMs and cutting-edge proprietary LLMs. In this paper, we investigate such data augmentation in math reasoning and aim to answer: (1) Which data augmentation strategies are more effective? (2) What is the scaling relationship between the amount of augmented data and model performance? (3) Can data augmentation incentivize generalization to out-of-domain mathematical reasoning tasks? To this end, we create two new datasets, AugGSM8K and AugMATH, by complicating and diversifying the queries and sampling multiple reasoning paths from GSM8K and MATH. We obtain a series of LLMs called MuggleMath by fine-tuning LLaMA models on AugGSM8K and AugMATH.…
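The two augmentation axes named in the abstract, query augmentation and sampling multiple reasoning paths, can be sketched as a nested loop. This is a minimal illustration under stated assumptions, not the authors' pipeline: `rewrite_query` and `sample_response` are hypothetical callables standing in for LLM calls, the toy stand-ins below only format strings, and keeping the original query alongside its rewrites is an assumption.

```python
from typing import Callable, Dict, List

def augment(
    seed_queries: List[str],
    rewrite_query: Callable[[str, int], str],    # hypothetical: returns the i-th rewritten query
    sample_response: Callable[[str, int], str],  # hypothetical: returns the j-th reasoning path
    queries_per_seed: int = 2,
    paths_per_query: int = 3,
) -> List[Dict[str, str]]:
    """Build (query, response) pairs from seed problems."""
    pairs = []
    for seed in seed_queries:
        # Query augmentation: keep the seed and add rewritten variants (assumption).
        queries = [seed] + [rewrite_query(seed, i) for i in range(queries_per_seed)]
        for query in queries:
            # Response augmentation: several reasoning paths per query.
            for j in range(paths_per_query):
                pairs.append({"query": query, "response": sample_response(query, j)})
    return pairs

# Toy stand-ins so the sketch runs without a model; a real setup would call an LLM here.
def toy_rewrite(query: str, i: int) -> str:
    return f"{query} [variant {i}]"

def toy_sample(query: str, j: int) -> str:
    return f"reasoning path {j} for: {query}"

seeds = ["Tom has 3 apples and buys 2 more. How many apples does he have?"]
pairs = augment(seeds, toy_rewrite, toy_sample)
```

With one seed, two rewrites, and three paths per query, the sketch yields 1 × (1 + 2) × 3 = 9 training pairs, which shows how the two axes multiply the dataset size.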

Code & Models

Repositories

ofa-sys/gsm8k-screl (PyTorch · Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research

Methods: LLaMA