MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
Xiang Yue, Xingwei Qu, Ge Zhang, Yao Fu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

TL;DR
MAmmoTH introduces a series of open-source large language models tailored for general math problem-solving, leveraging a hybrid of chain-of-thought and program-of-thought rationales to significantly outperform existing models.
Contribution
The paper presents MAmmoTH, a new open-source LLM series trained on a curated math instruction dataset with hybrid rationales, achieving state-of-the-art performance on multiple math reasoning benchmarks.
Findings
MAmmoTH models outperform existing open-source models by 16-32% on nine datasets.
MAmmoTH-7B reaches 33% on MATH, surpassing WizardMath by 23%.
MAmmoTH-34B achieves 44% on MATH, exceeding GPT-4's CoT results.
Abstract
We introduce MAmmoTH, a series of open-source large language models (LLMs) specifically tailored for general math problem-solving. The MAmmoTH models are trained on MathInstruct, our meticulously curated instruction tuning dataset. MathInstruct is compiled from 13 math datasets with intermediate rationales, six of which have rationales newly curated by us. It presents a unique hybrid of chain-of-thought (CoT) and program-of-thought (PoT) rationales, and also ensures extensive coverage of diverse fields in math. The hybrid of CoT and PoT not only unleashes the potential of tool use but also allows different thought processes for different math problems. As a result, the MAmmoTH series substantially outperform existing open-source models on nine mathematical reasoning datasets across all scales with an average accuracy gain between 16% and 32%. Remarkably, our MAmmoTH-7B model reaches 33%…
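To make the CoT/PoT distinction in the abstract concrete, here is a minimal sketch of how the two rationale styles differ in practice. A CoT rationale states its reasoning in natural language and the answer is read off the text, while a PoT rationale is a program whose execution produces the answer. The word problem, both rationales, and the helper names `answer_from_cot` and `answer_from_pot` are hypothetical illustrations, not examples from MathInstruct and not the authors' evaluation code.

```python
import contextlib
import io

# Hypothetical word problem, invented for illustration (not from MathInstruct).
QUESTION = (
    "A shop sells pens at 3 dollars each. "
    "How much do 14 pens cost after a 10% discount?"
)

# CoT-style rationale: reasoning in natural language, answer stated in the text.
COT_RATIONALE = (
    "14 pens cost 14 * 3 = 42 dollars. "
    "A 10% discount removes 4.2 dollars, so the total is 37.8 dollars. "
    "The answer is 37.8"
)

# PoT-style rationale: a short program that computes and prints the answer.
POT_RATIONALE = """
price_per_pen = 3
num_pens = 14
discount = 0.10
total = price_per_pen * num_pens * (1 - discount)
print(round(total, 2))
"""


def answer_from_cot(rationale: str) -> str:
    """Extract the text that follows the final 'The answer is' marker."""
    marker = "The answer is"
    return rationale.rsplit(marker, 1)[1].strip().rstrip(".")


def answer_from_pot(program: str) -> str:
    """Run the program and return whatever it printed.

    Assumption: the program is trusted. Model-generated code should be
    executed in a sandbox, which this sketch does not provide.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(program, {})  # fresh globals so runs do not share state
    return buffer.getvalue().strip()


if __name__ == "__main__":
    print("Question:", QUESTION)
    print("CoT answer:", answer_from_cot(COT_RATIONALE))
    print("PoT answer:", answer_from_pot(POT_RATIONALE))
```

In the CoT path the arithmetic is carried out in the text itself, so any slip in the written reasoning goes straight into the answer. In the PoT path the interpreter does the computation, which is the "tool use" the abstract refers to.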
Code & Models
- 🤗 TIGER-Lab/MAmmoTH-Coder-7B · model · 18 dl · ♡ 27
- 🤗 TIGER-Lab/MAmmoTH-7B · model · 113 dl · ♡ 8
- 🤗 TIGER-Lab/MAmmoTH-13B · model · 22 dl · ♡ 9
- 🤗 TIGER-Lab/MAmmoTH-70B · model · 175 dl · ♡ 10
- 🤗 TIGER-Lab/MAmmoTH-Coder-34B · model · 150 dl · ♡ 7
- 🤗 TIGER-Lab/MAmmoTH-Coder-13B · model · 21 dl · ♡ 8
- 🤗 TheBloke/MAmmoTH-Coder-34B-AWQ · model · 7 dl · ♡ 1
- 🤗 TheBloke/MAmmoTH-Coder-34B-GPTQ · model · 14 dl · ♡ 2
- 🤗 TheBloke/MAmmoTH-Coder-34B-GGUF · model · 104 dl · ♡ 2
- 🤗 TheBloke/MAmmoTH-70B-GGUF · model · 88 dl
Taxonomy
Topics: Topic Modeling · Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning
