Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Tian Ye; Zicheng Xu; Yuanzhi Li; Zeyuan Allen-Zhu

arXiv:2407.20311 · cs.AI · July 31, 2024 · 3 cites

TL;DR

This paper investigates how language models solve grade-school math problems: what their hidden reasoning process looks like, whether they memorize templates or genuinely reason, and how model size and depth affect their problem-solving ability.

Contribution

The study provides controlled experiments uncovering the hidden reasoning mechanisms of language models in solving math problems, beyond simple memorization.

Findings

1. Models develop reasoning skills beyond memorization.

2. Hidden mental processes influence problem-solving accuracy.

3. Model size and depth are critical for effective math reasoning.

Abstract

Recent advances in language models have demonstrated their capability to solve mathematical reasoning problems, achieving near-perfect accuracy on grade-school level math benchmarks like GSM8K. In this paper, we formally study how language models solve these problems. We design a series of controlled experiments to address several fundamental questions: (1) Can language models truly develop reasoning skills, or do they simply memorize templates? (2) What is the model's hidden (mental) reasoning process? (3) Do models solve math questions using skills similar to or different from humans? (4) Do models trained on GSM8K-like datasets develop reasoning skills beyond those necessary for solving GSM8K problems? (5) What mental process causes models to make reasoning mistakes? (6) How large or deep must a model be to effectively solve GSM8K-level math questions? Our study uncovers many…

Code & Models

Repositories

Infini-AI-Lab/gsm_infinite

Models

fzmnm/TinyStoriesAdv_v2_92M
model · 3 downloads · 1 like

Datasets

fzmnm/TinyStoriesAdv-zh
dataset · 105 downloads

Taxonomy

Topics: Educational Tools and Methods · Mathematics Education and Teaching Techniques · Teaching and Learning Programming