CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning
Yilun Liu, Shimin Tao, Xiaofeng Zhao, Ming Zhu, Wenbing Ma, Junhao, Zhu, Chang Su, Yutai Hou, Miao Zhang, Min Zhang, Hongxia Ma, Li Zhang, Hao, Yang, Yanfei Jiang

TL;DR
CoachLM is a novel method that automatically revises instruction datasets to significantly improve their quality, leading to better instruction-following performance in LLMs and more efficient data cleaning processes.
Contribution
It introduces CoachLM, a new approach for dataset enhancement through automatic revisions, increasing high-quality samples from 17.7% to 78.9% and boosting LLM performance.
Findings
High-quality instruction samples increased to 78.9%.
Instruction-following capabilities improved by 29.9%.
Data cleaning efficiency improved by 20%.
Abstract
Instruction tuning is crucial for enabling Language Learning Models (LLMs) in responding to human instructions. The quality of instruction pairs used for tuning greatly affects the performance of LLMs. However, the manual creation of high-quality instruction datasets is costly, leading to the adoption of automatic generation of instruction pairs by LLMs as a popular alternative. To ensure the high quality of LLM-generated instruction datasets, several approaches have been proposed. Nevertheless, existing methods either compromise dataset integrity by filtering a large proportion of samples, or are unsuitable for industrial applications. In this paper, instead of discarding low-quality samples, we propose CoachLM, a novel approach to enhance the quality of instruction datasets through automatic revisions on samples in the dataset. CoachLM is trained from the samples revised by human…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsReal-time simulation and control systems
MethodsSparse Evolutionary Training
