Self-Instruct: Aligning Language Models with Self-Generated Instructions

Yizhong Wang; Yeganeh Kordi; Swaroop Mishra; Alisa Liu; Noah A. Smith; Daniel Khashabi; Hannaneh Hajishirzi

arXiv:2212.10560 · cs.CL · May 29, 2023 · 118 citations

TL;DR

Self-Instruct improves a language model's ability to follow instructions by having the model generate and filter its own training data. This reduces reliance on human annotation and yields performance comparable to models trained with private data.

Contribution

The paper introduces Self-Instruct, a framework for instruction tuning that bootstraps from a pretrained model's own generations to improve its instruction-following ability.

Findings

01. 33% absolute improvement over vanilla GPT3 on Super-NaturalInstructions

02. Performs on par with InstructGPT-001

03. Outperforms tuning on existing public instruction datasets

Abstract

Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in quantity, diversity, and creativity, therefore hindering the generality of the tuned model. We introduce Self-Instruct, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. Our pipeline generates instructions, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model. Applying our method to the vanilla GPT3, we demonstrate a 33% absolute improvement over the original model on Super-NaturalInstructions, on par with the performance of InstructGPT-001, which was trained with…
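
The filtering step mentioned in the abstract (dropping invalid or similar generations before finetuning) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the similarity measure (word-set Jaccard overlap), the 0.7 threshold, and the names `token_jaccard` and `filter_instructions` are all hypothetical choices made for this example.

```python
from typing import Iterable, List


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two strings.

    Assumption: a simple stand-in similarity measure for illustration.
    """
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def filter_instructions(
    candidates: Iterable[str],
    pool: List[str],
    max_similarity: float = 0.7,  # hypothetical threshold
) -> List[str]:
    """Keep candidates that are non-empty and not too similar to anything
    already in the pool or already kept in this pass."""
    kept: List[str] = []
    for cand in candidates:
        text = cand.strip()
        if not text:
            continue  # treat empty generations as invalid
        if any(token_jaccard(text, seen) >= max_similarity for seen in pool + kept):
            continue  # near-duplicate of an existing instruction
        kept.append(text)
    return kept


pool = ["Translate the sentence into French."]
generated = [
    "Translate the sentence into French.",
    "",
    "Write a haiku about autumn.",
    "translate the sentence into french.",
]
print(filter_instructions(generated, pool))
```

In a bootstrapping loop, the surviving instructions would be appended to the pool before the next round of generation, so each round is filtered against everything accepted so far.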

Taxonomy

Topics: Topic Modeling · Machine Learning and Data Classification · Machine Learning and Algorithms