Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning

Ming Li; Lichang Chen; Jiuhai Chen; Shwai He; Heng Huang; Jiuxiang Gu; Tianyi Zhou

arXiv:2310.11716·cs.CL·October 19, 2023·1 cites

TL;DR

This paper introduces reflection-tuning, a method in which an oracle LLM reflects on and rewrites the instructions and responses in an existing instruction-tuning dataset. Models trained on the recycled data outperform those trained on the original data.

Contribution

A data recycling approach that uses the self-improvement and judging capabilities of LLMs to raise the quality of instruction-tuning datasets, which in turn improves the models trained on them.

Findings

01 Recycled data improves LLM instruction tuning performance.

02 Models trained with recycled data outperform those with original datasets.

03 Reflection-tuning enhances data quality through self-assessment.

Abstract

Recent advancements in Large Language Models (LLMs) have expanded the horizons of natural language understanding and generation. Notably, the output control and alignment with the input of LLMs can be refined through instruction tuning. However, as highlighted in several studies, low-quality data in the training set are usually detrimental to instruction tuning, resulting in inconsistent or even misleading LLM outputs. We propose a novel method, termed "reflection-tuning," which addresses the problem by self-improvement and judging capabilities of LLMs. This approach utilizes an oracle LLM to recycle the original training data by introspecting and enhancing the quality of instructions and responses in the data. Extensive experiments on widely used evaluation benchmarks show that LLMs trained with our recycled data outperform those trained with existing datasets in various benchmarks.
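
The recycling step described in the abstract can be pictured with a minimal Python sketch. Everything named here is hypothetical: the prompt templates, the `Example` and `Oracle` types, the choice to refine the instruction before the response, and the fallback to the original text are illustrative assumptions, not the authors' implementation. The oracle is modeled as any function that maps a prompt string to a completion string, so a real LLM client could be substituted for the toy stand-in.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical prompt templates; the abstract does not give the actual wording.
INSTRUCTION_PROMPT = (
    "Critique the following instruction, then write an improved version.\n"
    "Instruction: {instruction}\n"
    "Improved instruction:"
)
RESPONSE_PROMPT = (
    "Critique the following response, then write a better one.\n"
    "Instruction: {instruction}\n"
    "Response: {response}\n"
    "Improved response:"
)


@dataclass(frozen=True)
class Example:
    """One instruction-tuning pair."""
    instruction: str
    response: str


# Assumption: the oracle LLM is any callable from prompt text to completion text.
Oracle = Callable[[str], str]


def recycle_example(example: Example, oracle: Oracle) -> Example:
    """Ask the oracle to improve the instruction, then the response.

    If the oracle returns an empty string, the original text is kept.
    """
    new_instruction = oracle(
        INSTRUCTION_PROMPT.format(instruction=example.instruction)
    ).strip() or example.instruction

    # The response is refined against the (possibly rewritten) instruction.
    new_response = oracle(
        RESPONSE_PROMPT.format(
            instruction=new_instruction, response=example.response
        )
    ).strip() or example.response

    return Example(new_instruction, new_response)


def recycle_dataset(data: List[Example], oracle: Oracle) -> List[Example]:
    """Recycle every pair in the dataset; the input list is not modified."""
    return [recycle_example(example, oracle) for example in data]


def toy_oracle(prompt: str) -> str:
    """Deterministic stand-in for an LLM call, for demonstration only.

    It echoes the field on the second-to-last prompt line with a marker
    prepended, so the data flow can be checked without a model.
    """
    field = prompt.splitlines()[-2]
    return "[refined] " + field.split(": ", 1)[1]


if __name__ == "__main__":
    original = [Example("add 2 and 3", "5")]
    for pair in recycle_dataset(original, toy_oracle):
        print(pair)
```

In this sketch the recycled list would then replace the original dataset as input to ordinary instruction tuning; the training step itself is unchanged and is not shown.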

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Sparse Evolutionary Training