CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning

Yilun Liu; Shimin Tao; Xiaofeng Zhao; Ming Zhu; Wenbing Ma; Junhao Zhu; Chang Su; Yutai Hou; Miao Zhang; Min Zhang; Hongxia Ma; Li Zhang; Hao Yang; Yanfei Jiang

arXiv:2311.13246·cs.CL·March 22, 2024·1 cites

TL;DR

CoachLM is a novel method that automatically revises instruction datasets to significantly improve their quality, leading to better instruction-following performance in LLMs and more efficient data cleaning processes.

Contribution

It introduces CoachLM, a new approach for dataset enhancement through automatic revisions, increasing high-quality samples from 17.7% to 78.9% and boosting LLM performance.
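As a quick arithmetic check on the reported proportions, the short sketch below converts the two percentages into a percentage-point gain, a ratio, and sample counts. The dataset size of 1,000 is a hypothetical figure chosen only to make the counts easy to read; it is not taken from the paper.

```python
# Reported shares of high-quality samples before and after revision.
before_pct = 17.7
after_pct = 78.9

# Hypothetical dataset size, used only for illustration.
dataset_size = 1000

# Absolute gain in percentage points and relative gain as a ratio.
gain_points = round(after_pct - before_pct, 1)
gain_ratio = round(after_pct / before_pct, 2)

# High-quality sample counts implied by each share.
before_count = round(dataset_size * before_pct / 100)
after_count = round(dataset_size * after_pct / 100)

print(f"gain: {gain_points} percentage points ({gain_ratio}x)")
print(f"high-quality samples: {before_count} -> {after_count} of {dataset_size}")
```

Under that assumed size, the share of high-quality samples grows by 61.2 percentage points, or roughly 4.5 times.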

Findings

01 High-quality instruction samples increased to 78.9%.

02 Instruction-following capabilities improved by 29.9%.

03 Data cleaning efficiency improved by 20%.

Abstract

Instruction tuning is crucial for enabling large language models (LLMs) to respond to human instructions. The quality of the instruction pairs used for tuning greatly affects LLM performance. However, manually creating high-quality instruction datasets is costly, so automatic generation of instruction pairs by LLMs has become a popular alternative. Several approaches have been proposed to ensure the quality of LLM-generated instruction datasets. Nevertheless, existing methods either compromise dataset integrity by filtering out a large proportion of samples or are unsuitable for industrial applications. In this paper, instead of discarding low-quality samples, we propose CoachLM, a novel approach that enhances the quality of instruction datasets through automatic revisions of the samples in the dataset. CoachLM is trained from the samples revised by human…
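The contrast the abstract draws between filtering and revising can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_is_high_quality` and `toy_revise` are hypothetical stand-ins (a trivial formatting check and a trivial clean-up), not the paper's quality criteria or the trained CoachLM model.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (instruction, response)


def filter_dataset(pairs: List[Pair], is_high_quality: Callable[[Pair], bool]) -> List[Pair]:
    """Filtering: keep only samples judged high quality, shrinking the dataset."""
    return [p for p in pairs if is_high_quality(p)]


def revise_dataset(pairs: List[Pair], revise: Callable[[Pair], Pair]) -> List[Pair]:
    """Revising: rewrite each sample, so the dataset keeps its original size."""
    return [revise(p) for p in pairs]


def toy_is_high_quality(pair: Pair) -> bool:
    # Hypothetical check: response has no stray whitespace and ends with a period.
    _, response = pair
    return response == response.strip() and response.endswith(".")


def toy_revise(pair: Pair) -> Pair:
    # Hypothetical reviser: trim whitespace and ensure a single trailing period.
    instruction, response = pair
    return instruction.strip(), response.strip().rstrip(".") + "."


pairs = [("Name a color.", "Blue."), ("Add 2 and 3", " 5 ")]

filtered = filter_dataset(pairs, toy_is_high_quality)
revised = revise_dataset(pairs, toy_revise)

print(len(filtered), len(revised))
```

In this toy setup, filtering drops the second sample while revising keeps both, which is the dataset-integrity trade-off the abstract describes.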

Taxonomy

Topics: Real-time simulation and control systems

Methods: Sparse Evolutionary Training