Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning

Yuanhao Yue; Chengyu Wang; Jun Huang; Peng Wang

arXiv:2405.13448 · cs.CL · October 4, 2024

TL;DR

This paper introduces TAPIR, a curriculum-based distillation method that improves large language models' instruction-following abilities by selecting instructions that are difficult for the student model and rebalancing the task distribution of the training set, leading to stronger performance with less data.

Contribution

The paper proposes TAPIR, a novel multi-round distillation framework that incorporates task-aware curriculum planning and automatic response refinement to enhance LLM instruction tuning.

Findings

1. Student LLMs trained with TAPIR outperform larger instruction-tuned models on benchmarks.

2. TAPIR achieves better results with less training data.

3. Curriculum planning systematically improves instruction-following abilities.

Abstract

Instruction tuning aims to align large language models (LLMs) with open-domain instructions and human-preferred responses. While several studies have explored autonomous approaches to distilling and annotating instructions from powerful proprietary LLMs, such as ChatGPT, they often neglect the impact of the distributions and characteristics of tasks, together with the varying difficulty of instructions in training sets. This oversight can lead to imbalanced knowledge capabilities and poor generalization powers of student LLMs. To address these challenges, we introduce Task-Aware Curriculum Planning for Instruction Refinement (TAPIR), a multi-round distillation framework that utilizes an oracle LLM to select instructions that are difficult for a student LLM to follow. To balance the student's capabilities, task distributions in training sets are adjusted with responses automatically…
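The selection and balancing steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how difficulty is scored or how task distributions are adjusted, so the `oracle_score` callable, the fixed `threshold`, and the per-task cap used here are all assumptions made for illustration, and the toy scores are invented.

```python
from collections import Counter
from typing import Callable, Dict, List

Example = Dict[str, str]  # hypothetical record: {"task": ..., "text": ...}


def select_difficult(
    pool: List[Example],
    oracle_score: Callable[[Example], float],
    threshold: float,
) -> List[Example]:
    """Keep instructions whose student response the oracle rates below threshold.

    `oracle_score` stands in for an oracle LLM judging the student's answer;
    its scale and the threshold are assumptions, not taken from the paper.
    """
    return [ex for ex in pool if oracle_score(ex) < threshold]


def cap_per_task(examples: List[Example], cap: int) -> List[Example]:
    """Limit each task to `cap` examples (one assumed way to adjust the mix)."""
    counts: Counter = Counter()
    balanced = []
    for ex in examples:
        if counts[ex["task"]] < cap:
            counts[ex["task"]] += 1
            balanced.append(ex)
    return balanced


# Toy data: scores are made up purely to exercise the functions.
toy_pool = [
    {"task": "math", "text": "Prove that sqrt(2) is irrational."},
    {"task": "math", "text": "Compute 17 * 23."},
    {"task": "math", "text": "Solve x^2 = 9."},
    {"task": "writing", "text": "Write a haiku about rain."},
    {"task": "coding", "text": "Reverse a linked list."},
]
toy_scores = {
    "Prove that sqrt(2) is irrational.": 3.0,
    "Compute 17 * 23.": 4.0,
    "Solve x^2 = 9.": 9.0,
    "Write a haiku about rain.": 8.5,
    "Reverse a linked list.": 5.0,
}

hard = select_difficult(toy_pool, lambda ex: toy_scores[ex["text"]], threshold=6.0)
balanced = cap_per_task(hard, cap=1)
```

In a multi-round setting, the filtered and rebalanced set would presumably be used to fine-tune the student before the pool is scored again; that loop is omitted because the abstract does not describe it in detail.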

Taxonomy

Topics: Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning · Natural Language Processing Techniques

Methods: ALIGN