SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models

Dian Yu; Baolin Peng; Ye Tian; Linfeng Song; Haitao Mi; Dong Yu

arXiv:2408.15565·cs.CL·August 29, 2024

TL;DR

SIaM is a self-improving framework that strengthens the code-assisted mathematical reasoning of large language models by using large-scale, diverse, expert-written question-answer pairs together with a code-based critic model.

Contribution

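The paper proposes a paradigm in which a code-based critic model and alignment algorithms are used to improve LLMs' code-assisted mathematical reasoning on diverse question-answer data. It targets the loss of generalization that can result from continually training on augmented data derived from a few datasets such as GSM8K.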

Findings

01. Improves in-domain accuracy by up to 5.7%

02. Enhances out-of-domain performance by 4.4%

03. Effective across English and Chinese benchmarks

Abstract

There is a growing trend of teaching large language models (LLMs) to solve mathematical problems through coding. Existing studies primarily focus on prompting powerful, closed-source models to generate seed training data followed by in-domain data augmentation, equipping LLMs with considerable capabilities for code-aided mathematical reasoning. However, continually training these models on augmented data derived from a few datasets such as GSM8K may impair their generalization abilities and restrict their effectiveness to a narrow range of question types. Conversely, the potential of improving such LLMs by leveraging large-scale, expert-written, diverse math question-answer pairs remains unexplored. To utilize these resources and tackle unique challenges such as code response assessment, we propose a novel paradigm that uses a code-based critic model to guide steps including…
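To make "solving mathematical problems through coding" concrete, the following is a minimal sketch of the general idea: a model-written program is executed and its printed output is compared with a reference answer. The question, the candidate program, and the helper names `run_program` and `answers_match` are hypothetical and chosen only for illustration. This rule-based comparison is not the paper's critic; the abstract describes a code-based critic model, whose details are not given here.

```python
import contextlib
import io


def run_program(source: str) -> str:
    """Execute a candidate program and return what it printed, stripped."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Illustration only: a real system would run untrusted code in a sandbox.
        exec(source, {})
    return buffer.getvalue().strip()


def answers_match(predicted: str, reference: str, tol: float = 1e-6) -> bool:
    """Compare numerically when both values parse as floats, else as strings."""
    try:
        return abs(float(predicted) - float(reference)) <= tol
    except ValueError:
        return predicted.strip() == reference.strip()


# Hypothetical question: "A shop sells pens at 3 for $2. How much do 12 pens cost?"
candidate_program = """
pens = 12
price_per_three = 2
print(pens / 3 * price_per_three)
"""
reference_answer = "8"  # hypothetical expert-written answer

output = run_program(candidate_program)
is_correct = answers_match(output, reference_answer)
```

Even in this toy case the program prints a float while the reference is an integer string, so exact string matching would reject a correct response. Mismatches of this kind are one plausible reason that assessing code responses against expert-written answers is harder than it first appears.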

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Mathematics, Computing, and Information Processing
