From Instance Training to Instruction Learning: Task Adapters Generation from Instructions
Huanxuan Liao, Shizhu He, Yao Xu, Yuanzhe Zhang, Yanchao Hao, Shengping Liu, Kang Liu, Jun Zhao

TL;DR
This paper introduces TAGI, a method that generates task-specific adapters from task instructions. It enables cross-task generalization without retraining on task instances, and it matches or outperforms meta-trained models while reducing computational costs.
Contribution
We propose TAGI, a novel instruction-based adapter generation method that enhances cross-task generalization and reduces training requirements for large language models.
Findings
TAGI matches or outperforms meta-trained models on instruction datasets.
TAGI significantly reduces computational costs compared to traditional methods.
The two-stage training process effectively endows TAGI with cross-task generalization.
Abstract
Large language models (LLMs) have acquired the ability to solve general tasks by utilizing instruction finetuning (IFT). However, IFT still relies heavily on instance training of extensive task data, which greatly limits the adaptability of LLMs to real-world scenarios where labeled task instances are scarce and broader task generalization becomes paramount. Contrary to LLMs, humans acquire skills and complete tasks not merely through repeated practice but also by understanding and following instructional guidelines. This paper is dedicated to simulating human learning to address the shortcomings of instance training, focusing on instruction learning to enhance cross-task generalization. Within this context, we introduce Task Adapters Generation from Instructions (TAGI), which automatically constructs the task-specific model in a parameter generation manner based on the given task…
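The "parameter generation" idea in the abstract can be illustrated with a generic hypernetwork pattern: a small network maps an instruction embedding to the weights of a low-rank adapter, which is then applied to a hidden state. The sketch below is a minimal illustration under stated assumptions, not TAGI's actual architecture. The class `AdapterHypernetwork`, the function `apply_adapter`, the single linear projection, and all dimensions are hypothetical, and the instruction embedding is a hand-written vector rather than the output of a real encoder. Training (including any distillation step) is omitted.

```python
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class AdapterHypernetwork:
    """Hypothetical hypernetwork: instruction embedding -> low-rank adapter weights."""

    def __init__(self, embed_dim, hidden_dim, rank, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.rank = rank
        # One linear map that emits both adapter factors as a flat vector.
        self.proj = make_matrix(2 * rank * hidden_dim, embed_dim, rng)

    def generate(self, instruction_embedding):
        """Return (down, up): a rank x hidden and a hidden x rank matrix."""
        flat = matvec(self.proj, instruction_embedding)
        r, d = self.rank, self.hidden_dim
        down = [flat[i * d:(i + 1) * d] for i in range(r)]
        offset = r * d
        up = [flat[offset + i * r:offset + (i + 1) * r] for i in range(d)]
        return down, up


def apply_adapter(hidden, down, up):
    """Residual low-rank update: hidden + up @ (down @ hidden)."""
    low = matvec(down, hidden)
    delta = matvec(up, low)
    return [h + dv for h, dv in zip(hidden, delta)]


if __name__ == "__main__":
    hyper = AdapterHypernetwork(embed_dim=4, hidden_dim=6, rank=2)
    # Stand-in for an encoded task instruction (assumed, not a real encoder output).
    instruction = [1.0, 0.0, -1.0, 0.5]
    down, up = hyper.generate(instruction)
    adapted = apply_adapter([1.0] * 6, down, up)
    print(len(down), len(up), len(adapted))
```

In this pattern a new task needs only a new instruction embedding: the hypernetwork emits a fresh adapter in one forward pass, so no per-task gradient updates on task instances are required.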
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Online Learning and Analytics
Methods: HyperNetwork · Adapter · Knowledge Distillation
