LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement

Jiahao Ying; Mingbao Lin; Yixin Cao; Wei Tang; Bo Wang; Qianru Sun; Xuanjing Huang; Shuicheng Yan

arXiv:2407.00497·cs.CL·July 2, 2024

Open Access

TL;DR

This paper presents a framework in which a large language model acts as an instructor for a smaller target model: it analyzes the target model's errors and tailors training data to them, improving results on benchmarks that include mathematical reasoning and coding.

Contribution

The paper introduces an LLM-based instructor framework with two strategies, "Learning from Error" and "Learning from Error by Contrast", that use error analysis and contrastive learning to enhance smaller models' capabilities, outperforming existing methods.

Findings

01. Refined Llama-3-8b-Instruction surpasses ChatGPT in benchmarks.

02. Error-focused training improves mathematical reasoning and coding.

03. Contrastive learning yields balanced in-domain and out-of-domain performance.

Abstract

This paper introduces the innovative "LLMs-as-Instructors" framework, which leverages advanced Large Language Models (LLMs) to autonomously enhance the training of smaller target models. Inspired by the theory of "Learning from Errors", this framework employs an instructor LLM to meticulously analyze the specific errors within a target model, facilitating targeted and efficient training cycles. Within this framework, we implement two strategies: "Learning from Error", which focuses solely on incorrect responses to tailor training data, and "Learning from Error by Contrast", which uses contrastive learning to analyze both correct and incorrect responses for a deeper understanding of errors. Our empirical studies, conducted with several open-source models, demonstrate significant improvements across multiple benchmarks, including mathematical reasoning, coding abilities, and factual…
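The difference between the two strategies named in the abstract can be illustrated with a small sketch of how evaluation results might be selected before they are handed to the instructor LLM. This is an illustration under stated assumptions, not the authors' implementation: the `EvalRecord` schema, the `topic` field used to pair cases, the function names, and the prompt wording are all hypothetical, and no actual LLM call is made.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class EvalRecord:
    """One target-model response on a benchmark question (hypothetical schema)."""
    question: str
    response: str
    is_correct: bool
    topic: str  # hypothetical grouping key, used only to pair cases below


def select_errors(records: List[EvalRecord]) -> List[EvalRecord]:
    """'Learning from Error': keep only the incorrect responses."""
    return [r for r in records if not r.is_correct]


def select_contrast_pairs(
    records: List[EvalRecord],
) -> List[Tuple[EvalRecord, EvalRecord]]:
    """'Learning from Error by Contrast': pair incorrect with correct responses.

    Assumption: each error is paired with the first correct response that
    shares its topic; the abstract does not say how pairs are chosen.
    """
    correct_by_topic: Dict[str, EvalRecord] = {}
    for r in records:
        if r.is_correct:
            correct_by_topic.setdefault(r.topic, r)
    return [
        (r, correct_by_topic[r.topic])
        for r in records
        if not r.is_correct and r.topic in correct_by_topic
    ]


def build_instructor_prompt(
    error: EvalRecord, correct: Optional[EvalRecord] = None
) -> str:
    """Format one case (optionally with a contrasting success) for the instructor."""
    lines = [
        "Analyze why the target model failed and propose new training samples.",
        f"Failed question: {error.question}",
        f"Incorrect response: {error.response}",
    ]
    if correct is not None:
        lines += [
            f"Correctly answered question: {correct.question}",
            f"Correct response: {correct.response}",
        ]
    return "\n".join(lines)
```

In this sketch, calling `build_instructor_prompt(error)` corresponds to the error-only strategy, while `build_instructor_prompt(error, correct)` supplies the contrasting correct case; the instructor's output would then be used to tailor the next round of training data.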

Taxonomy

Topics: Artificial Intelligence in Law

Methods: Contrastive Learning