On Transferability of Prompt Tuning for Natural Language Processing
Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Huadong Wang, Kaiyue Wen, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou

TL;DR
This paper investigates the transferability of prompt tuning in NLP as a way to make it more efficient. It shows that soft prompts can transfer across similar tasks and across models, and that the overlap in the neuron activations that prompts induce influences whether transfer succeeds.
Contribution
It empirically demonstrates prompt transferability across tasks and models, and identifies neuron activation overlap as a key factor influencing transfer success.
Findings
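In the zero-shot setting, trained soft prompts transfer effectively to similar tasks on the same PLM, and to other PLMs through a cross-model projector trained on similar tasks.
Initializing from transferred prompts accelerates training and improves performance.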
Neuron activation overlap correlates with transferability.
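The last finding can be made concrete with a small sketch. This summary does not specify how the paper measures overlap, so the metric below is an assumption for illustration: each neuron is treated as "on" when its pre-activation is positive (as under a ReLU), and two prompts are compared by the cosine similarity of their binary on/off patterns. The function names and the input values are hypothetical.

```python
import math


def activation_pattern(pre_activations):
    """Binarize neuron pre-activations: 1 if the neuron fires (> 0), else 0."""
    return [1 if x > 0 else 0 for x in pre_activations]


def activation_overlap(pattern_a, pattern_b):
    """Cosine similarity between two binary activation patterns.

    Assumed metric for illustration; not confirmed as the paper's definition.
    """
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same neurons")
    shared = sum(a * b for a, b in zip(pattern_a, pattern_b))
    norm = math.sqrt(sum(pattern_a) * sum(pattern_b))
    return shared / norm if norm else 0.0


# Made-up pre-activations for the same four neurons under two different prompts.
pattern_a = activation_pattern([0.3, -1.2, 0.8, 0.1])   # neurons 0, 2, 3 fire
pattern_b = activation_pattern([0.5, -0.4, -0.2, 0.9])  # neurons 0, 3 fire
overlap = activation_overlap(pattern_a, pattern_b)
```

Under this assumed metric, a score near 1 means the two prompts switch on nearly the same neurons, which, per the finding above, would suggest the prompts are more likely to transfer between their tasks.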
Abstract
Prompt tuning (PT) is a promising parameter-efficient method to utilize extremely large pre-trained language models (PLMs), which can achieve comparable performance to full-parameter fine-tuning by only tuning a few soft prompts. However, PT requires much more training time than fine-tuning. Intuitively, knowledge transfer can help to improve the efficiency. To explore whether we can improve PT via prompt transfer, we empirically investigate the transferability of soft prompts across different downstream tasks and PLMs in this work. We find that (1) in zero-shot setting, trained soft prompts can effectively transfer to similar tasks on the same PLM and also to other PLMs with a cross-model projector trained on similar tasks; (2) when used as initialization, trained soft prompts of similar tasks and projected prompts of other PLMs can significantly accelerate training and also improve…
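To illustrate what a cross-model projector does, here is a minimal sketch under stated assumptions. The abstract does not describe the projector's architecture, so the two-layer network with a tanh nonlinearity, the dimensions, and all names below are hypothetical. The sketch only shows the shape change: a soft prompt stored as a list of vectors in the source PLM's embedding width is mapped, vector by vector, to the target PLM's embedding width. The weights are random here; in the setting the abstract describes, the projector would be trained on similar tasks before use.

```python
import math
import random


def init_linear(in_dim, out_dim, rng):
    """Create a randomly initialized linear layer as (weight, bias)."""
    scale = 1.0 / math.sqrt(in_dim)
    weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
              for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def linear(x, weight, bias):
    """Apply y = W x + b to a single vector; W has shape (out_dim, in_dim)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def project_prompt(prompt, layer1, layer2):
    """Map each soft-prompt vector from the source width to the target width."""
    projected = []
    for token_vec in prompt:
        hidden = [math.tanh(h) for h in linear(token_vec, *layer1)]
        projected.append(linear(hidden, *layer2))
    return projected


# Hypothetical sizes: 3 prompt tokens, source width 4, target width 6.
rng = random.Random(0)
src_dim, hidden_dim, tgt_dim, prompt_len = 4, 8, 6, 3
layer1 = init_linear(src_dim, hidden_dim, rng)
layer2 = init_linear(hidden_dim, tgt_dim, rng)
source_prompt = [[rng.gauss(0.0, 1.0) for _ in range(src_dim)]
                 for _ in range(prompt_len)]
target_prompt = project_prompt(source_prompt, layer1, layer2)
```

The prompt length is unchanged by the projection; only the width of each prompt vector changes, so the result can be prepended to inputs of the target model.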
