Learning From Mistakes Makes LLM Better Reasoner

Shengnan An; Zexiong Ma; Zeqi Lin; Nanning Zheng; Jian-Guang Lou; Weizhu Chen

arXiv:2310.20689 · cs.CL · April 1, 2024 · 5 cites

TL;DR

This paper introduces LEMA, a fine-tuning approach that enhances large language models' reasoning by learning from their mistakes through correction data, inspired by human error-driven learning.

Contribution

LEMA is a novel fine-tuning method that incorporates mistake correction data to improve LLM reasoning capabilities, demonstrating effectiveness across various models and tasks.

Findings

01  LEMA improves reasoning performance over fine-tuning on chain-of-thought data alone.

02  Correction data and chain-of-thought data are not interchangeable: ablations show that their effectiveness is non-homogeneous.

03  A correction-centric evolution strategy expands the question set used to generate correction data.

Abstract

Large language models (LLMs) have recently exhibited remarkable reasoning capabilities in solving math problems. To further improve their reasoning capabilities, this work explores whether LLMs can LEarn from MistAkes (LEMA), akin to the human learning process. Consider a human student who fails to solve a math problem: the student learns from what mistake was made and how to correct it. Mimicking this error-driven learning process, LEMA incorporates mistake-correction data pairs when fine-tuning LLMs. Specifically, we first collect inaccurate reasoning paths from various LLMs, and then employ GPT-4 as a "corrector" to identify the mistake step, explain the reason for the mistake, correct the mistake, and generate the final answer. In addition, we apply a correction-centric evolution strategy that effectively expands the question set for generating correction data. Experiments across…
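The data-collection steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `CorrectionRecord` fields, the prompt wording in `CORRECTOR_TEMPLATE`, and the answer-matching filter in `keep_verified` are all assumptions made for the example, and the call to the corrector model itself is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CorrectionRecord:
    """One mistake-correction pair (hypothetical schema, not from the paper)."""
    question: str
    wrong_solution: str    # inaccurate reasoning path sampled from some LLM
    correction: str        # corrector output: mistake step, reason, fixed solution
    corrected_answer: str  # final answer extracted from the correction


# Assumed prompt wording; it mirrors the three tasks the abstract assigns
# to the corrector, but the paper's actual prompt is not reproduced here.
CORRECTOR_TEMPLATE = (
    "Question:\n{question}\n\n"
    "Solution containing a mistake:\n{wrong_solution}\n\n"
    "1. Identify the incorrect step.\n"
    "2. Explain why that step is incorrect.\n"
    "3. Correct the solution from that step and state the final answer."
)


def build_corrector_prompt(question: str, wrong_solution: str) -> str:
    """Format the request that would be sent to the corrector model."""
    return CORRECTOR_TEMPLATE.format(
        question=question, wrong_solution=wrong_solution
    )


def keep_verified(
    records: List[CorrectionRecord], gold_answers: Dict[str, str]
) -> List[CorrectionRecord]:
    """Assumed filter: keep corrections whose final answer matches a reference."""
    return [
        r for r in records
        if r.corrected_answer.strip() == gold_answers[r.question].strip()
    ]


def to_finetune_pair(record: CorrectionRecord) -> Dict[str, str]:
    """Turn a record into a prompt/completion pair for supervised fine-tuning."""
    prompt = build_corrector_prompt(record.question, record.wrong_solution)
    return {"prompt": prompt, "completion": record.correction}
```

Under these assumptions, the pairs produced by `to_finetune_pair` would be combined with ordinary chain-of-thought examples to form the fine-tuning set.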

Code & Models

Repositories

microsoft/lema
PyTorch · Official

Taxonomy

Topics: Artificial Intelligence in Law · Law, AI, and Intellectual Property

Methods: Sparse Evolutionary Training · Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Residual Connection · Byte Pair Encoding · Dense Connections · Layer Normalization · Label Smoothing