Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?

Chengwei Qin; Qian Li; Ruochen Zhao; Shafiq Joty

arXiv:2302.08143 · cs.CL · November 21, 2023

Open Access

TL;DR

This paper investigates how meta-learning can enhance cross-task generalization in prompt tuning by learning to initialize prompt embeddings, showing significant improvements especially in classification tasks.

Contribution

The paper systematically explores meta prompt tuning (MPT) with various meta-learning algorithms across diverse tasks, demonstrating its effectiveness over standard prompt tuning (PT).

Findings

1. MPT significantly improves performance on classification tasks.
2. MPT outperforms PT in most cases for question answering.
3. Task similarity influences MPT effectiveness.

Abstract

Prompt tuning (PT), which tunes only the embeddings of an additional sequence of tokens per task while keeping the pre-trained language model (PLM) frozen, has shown remarkable performance in few-shot learning. Despite this, PT has been shown to rely heavily on good initialization of the prompt embeddings. In this work, we study meta prompt tuning (MPT) to systematically explore whether and how meta-learning can improve cross-task generalization in PT by learning to initialize the prompt embeddings from other relevant tasks. We empirically analyze a representative set of meta-learning algorithms in a wide range of adaptation settings with different source/target task configurations on a large set of few-shot tasks. With extensive experiments and analysis, we demonstrate the effectiveness of MPT. We find the improvement to be significant particularly on classification tasks. For…
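The idea in the abstract (tune only a prompt while the model stays frozen, and meta-learn where that prompt starts) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "PLM" is a fixed linear readout rather than a language model, the tasks are synthetic, and a Reptile-style first-order update stands in for whichever meta-learning algorithms the paper actually evaluates. All function and variable names are hypothetical.

```python
import random

# Toy stand-in for a frozen PLM: fixed weights that are never updated.
WEIGHTS = [0.5, -1.0, 2.0]


def frozen_model(prompt, x):
    """Linear readout over (prompt + input); only `prompt` is trainable."""
    return sum(w * (p + xi) for w, p, xi in zip(WEIGHTS, prompt, x))


def loss_and_grad(prompt, batch):
    """Mean squared error and its gradient with respect to the prompt only."""
    n = len(batch)
    loss = 0.0
    grad = [0.0] * len(prompt)
    for x, y in batch:
        err = frozen_model(prompt, x) - y
        loss += err * err / n
        for i, w in enumerate(WEIGHTS):
            grad[i] += 2.0 * err * w / n
    return loss, grad


def prompt_tune(init, batch, lr=0.02, steps=3):
    """Few-step prompt tuning: gradient descent on the prompt from `init`."""
    prompt = list(init)
    for _ in range(steps):
        _, grad = loss_and_grad(prompt, batch)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt


def reptile_init(source_tasks, dim, meta_lr=0.5, rounds=200, seed=0):
    """Reptile-style meta-learning of the prompt initialization (assumed choice)."""
    rng = random.Random(seed)
    init = [0.0] * dim
    for _ in range(rounds):
        batch = rng.choice(source_tasks)
        adapted = prompt_tune(init, batch)
        # Move the shared initialization toward the task-adapted prompt.
        init = [p + meta_lr * (a - p) for p, a in zip(init, adapted)]
    return init


def make_task(shift, rng, n=8):
    """Synthetic few-shot task: targets are the frozen model's output plus a shift."""
    zero_prompt = [0.0] * len(WEIGHTS)
    batch = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in WEIGHTS]
        batch.append((x, frozen_model(zero_prompt, x) + shift))
    return batch


rng = random.Random(1)
source_tasks = [make_task(s, rng) for s in (2.5, 3.0, 3.5)]  # related source tasks
target_task = make_task(3.2, rng)                            # unseen target task

zero_init = [0.0] * len(WEIGHTS)
meta_init = reptile_init(source_tasks, dim=len(WEIGHTS))

# Same tuning budget on the target task, different starting points.
loss_zero, _ = loss_and_grad(prompt_tune(zero_init, target_task), target_task)
loss_meta, _ = loss_and_grad(prompt_tune(meta_init, target_task), target_task)
print("target loss from zero init:", loss_zero)
print("target loss from meta-learned init:", loss_meta)
```

In this toy setting the source and target tasks are deliberately similar, so the meta-learned starting point sits close to the target optimum and the same small tuning budget gets further. It says nothing about how large the effect is on real tasks.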

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications