Learning From Mistakes Makes LLM Better Reasoner
Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen

TL;DR
This paper introduces LEMA, a fine-tuning approach that enhances large language models' reasoning by learning from their mistakes through correction data, inspired by human error-driven learning.
Contribution
LEMA is a novel fine-tuning method that incorporates mistake correction data to improve LLM reasoning capabilities, demonstrating effectiveness across various models and tasks.
Findings
LEMA improves reasoning performance over standard fine-tuning.
Correction data and chain-of-thought data differ in effectiveness; the two are not interchangeable as training data.
A correction-centric evolution strategy effectively expands the question set used to generate correction data.
Abstract
Large language models (LLMs) recently exhibited remarkable reasoning capabilities on solving math problems. To further improve their reasoning capabilities, this work explores whether LLMs can LEarn from MistAkes (LEMA), akin to the human learning process. Consider a human student who failed to solve a math problem, he will learn from what mistake he has made and how to correct it. Mimicking this error-driven learning process, LEMA incorporates mistake-correction data pairs during fine-tuning LLMs. Specifically, we first collect inaccurate reasoning paths from various LLMs, and then employ GPT-4 as a "corrector" to identify the mistake step, explain the reason for the mistake, correct the mistake and generate the final answer. In addition, we apply a correction-centric evolution strategy that effectively expands the question set for generating correction data. Experiments across…
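The data-collection step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `ReasoningPath`, `collect_inaccurate_paths`, and `build_corrector_prompt` are hypothetical, the exact-match answer check is an assumed simplification, and the prompt wording is invented for illustration. The call to the corrector model itself is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReasoningPath:
    """One model-generated solution (hypothetical structure for illustration)."""
    question: str
    steps: List[str]    # chain-of-thought steps produced by some model
    final_answer: str   # answer extracted from the last step


def collect_inaccurate_paths(paths: List[ReasoningPath],
                             gold_answers: Dict[str, str]) -> List[ReasoningPath]:
    """Keep only paths whose final answer disagrees with the reference answer.

    Assumption: answers are compared by exact string match after stripping.
    """
    return [
        p for p in paths
        if p.final_answer.strip() != gold_answers[p.question].strip()
    ]


def build_corrector_prompt(path: ReasoningPath) -> str:
    """Format a request for the four corrector outputs named in the abstract."""
    numbered = "\n".join(
        f"Step {i}: {step}" for i, step in enumerate(path.steps, start=1)
    )
    return (
        f"Question: {path.question}\n"
        f"Solution containing a mistake:\n{numbered}\n\n"
        "1. Identify the incorrect step.\n"
        "2. Explain why it is incorrect.\n"
        "3. Correct the step and continue the solution.\n"
        "4. State the final answer."
    )


# Toy usage: one wrong path and one correct path.
paths = [
    ReasoningPath("What is 3 * 4 + 2?", ["3 * 4 = 12", "12 + 2 = 15"], "15"),
    ReasoningPath("What is 10 - 7?", ["10 - 7 = 3"], "3"),
]
gold = {"What is 3 * 4 + 2?": "14", "What is 10 - 7?": "3"}

wrong = collect_inaccurate_paths(paths, gold)
prompt = build_corrector_prompt(wrong[0])
print(prompt)
```

Only the path with the wrong final answer is forwarded to the corrector. In a full pipeline, the corrector's response would be paired with the original question and mistaken path to form one mistake-correction training example.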
Taxonomy
Topics: Artificial Intelligence in Law · Law, AI, and Intellectual Property
Methods: Sparse Evolutionary Training · Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Residual Connection · Byte Pair Encoding · Dense Connections · Layer Normalization · Label Smoothing
