Instruction Tuning for Large Language Models: A Survey

Shengyu Zhang; Linfeng Dong; Xiaoya Li; Sen Zhang; Xiaofei Sun; Shuhe Wang; Jiwei Li; Runyi Hu; Tianwei Zhang; Fei Wu; Guoyin Wang

arXiv:2308.10792·cs.CL·October 7, 2025·97 cites

TL;DR

This survey comprehensively reviews instruction tuning for large language models, covering methodologies, dataset construction, applications, challenges, and future research directions in this rapidly evolving field.

Contribution

It provides a systematic overview of instruction tuning techniques, datasets, applications, and critical analysis of current challenges and future research avenues.

Findings

01. Instruction tuning enhances LLM capabilities and controllability.

02. Dataset size and quality significantly impact SFT outcomes.

03. Current strategies have notable deficiencies and areas for improvement.

Abstract

This paper surveys research in the rapidly advancing field of instruction tuning (IT), also referred to as supervised fine-tuning (SFT), a crucial technique for enhancing the capabilities and controllability of large language models (LLMs). (In this paper, unless specified otherwise, SFT and IT are used interchangeably.) Instruction tuning refers to the process of further training LLMs on a dataset of (instruction, output) pairs in a supervised fashion, which bridges the gap between the next-word prediction objective of LLMs and the users' objective of having LLMs adhere to human instructions. In this work, we make a systematic review of the literature, including the general methodology of SFT, the construction of SFT datasets, the training of SFT models, and applications to different modalities,…
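To make the training objective described in the abstract concrete, here is a minimal sketch of a supervised loss over one (instruction, output) pair. It is an illustration under stated assumptions, not the survey's own code. The function names (`build_example`, `sft_loss`) and the integer token ids are hypothetical. Masking the instruction tokens out of the loss is one common convention and is not something the abstract specifies.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def build_example(instruction_ids, output_ids):
    """Concatenate instruction and output token ids into one sequence.

    Returns the sequence and a loss mask that is 1 on output tokens only
    (assumption: instruction tokens are excluded from the loss).
    """
    tokens = instruction_ids + output_ids
    loss_mask = [0] * len(instruction_ids) + [1] * len(output_ids)
    return tokens, loss_mask


def sft_loss(logits_per_position, tokens, loss_mask):
    """Mean next-token negative log-likelihood over the masked-in targets.

    logits_per_position[t] holds the model's logits for predicting tokens[t + 1].
    """
    total, count = 0.0, 0
    for t in range(len(tokens) - 1):
        if loss_mask[t + 1]:
            total -= log_softmax(logits_per_position[t])[tokens[t + 1]]
            count += 1
    return total / max(count, 1)


# Toy usage: vocabulary of 4 ids, a model that predicts uniformly everywhere.
tokens, mask = build_example(instruction_ids=[0, 1], output_ids=[2, 3])
uniform_logits = [[0.0] * 4 for _ in range(len(tokens) - 1)]
loss = sft_loss(uniform_logits, tokens, mask)
```

With a uniform predictor the loss equals the log of the vocabulary size, which is a quick sanity check. The objective is still next-word prediction, but the targets now come from the desired response to an instruction, which is the gap-bridging role the abstract describes.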

Code & Models

Repositories

xiaoya-li/instruction-tuning-survey (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications