Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
Zhen Wang, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Huan Sun, Yoon Kim

TL;DR
Multitask prompt tuning (MPT) efficiently adapts large language models to multiple tasks by learning a shared prompt and task-specific updates, outperforming existing methods with minimal parameter tuning.
Contribution
Introduces MPT, a novel approach that distills cross-task knowledge into a shared prompt and uses low-rank updates for efficient task adaptation.
Findings
Outperforms state-of-the-art methods on 23 NLP datasets.
Outperforms the full fine-tuning baseline in some cases.
Tunes only 0.035% as many task-specific parameters as full fine-tuning.
Abstract
Prompt tuning, in which a base pretrained model is adapted to each task via conditioning on learned prompt vectors, has emerged as a promising approach for efficiently adapting large language models to multiple downstream tasks. However, existing methods typically learn soft prompt vectors from scratch, and it has not been clear how to exploit the rich cross-task knowledge with prompt vectors in a multitask learning setting. We propose multitask prompt tuning (MPT), which first learns a single transferable prompt by distilling knowledge from multiple task-specific source prompts. We then learn multiplicative low rank updates to this shared prompt to efficiently adapt it to each downstream target task. Extensive experiments on 23 NLP datasets demonstrate that our proposed approach outperforms the state-of-the-art methods, including the full finetuning baseline in some cases, despite only…
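The adaptation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the "multiplicative low rank update" is an elementwise product between the shared prompt and a low-rank matrix built from two small task-specific factors. The names `task_prompt`, `u`, and `v`, the prompt length, the embedding dimension, and the rank are all hypothetical choices made for the example.

```python
import numpy as np

def task_prompt(shared_prompt, u, v):
    """Scale a shared prompt elementwise by the low-rank matrix u @ v.T.

    Assumed shapes (illustrative only):
      shared_prompt: (prompt_len, dim)
      u:             (prompt_len, rank)
      v:             (dim, rank)
    """
    update = u @ v.T               # low-rank matrix of shape (prompt_len, dim)
    return shared_prompt * update  # multiplicative (elementwise) update

rng = np.random.default_rng(0)
prompt_len, dim, rank = 100, 768, 1  # hypothetical sizes

# Shared prompt: in MPT this would come from the distillation stage;
# here it is random, purely for illustration.
shared = rng.normal(size=(prompt_len, dim))

# Task-specific factors initialised to ones, so the adapted prompt
# starts out identical to the shared prompt when rank == 1.
u = np.ones((prompt_len, rank))
v = np.ones((dim, rank))

adapted = task_prompt(shared, u, v)

# Per-task parameters are only the two factors, not a whole new prompt.
per_task_params = u.size + v.size
full_prompt_params = shared.size
print(per_task_params, full_prompt_params)
```

Under these assumed sizes, each additional task stores `prompt_len + dim` values for its factors instead of `prompt_len * dim` values for a separate prompt, which is where the saving in task-specific parameters would come from.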
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Balanced Selection
