Learning a Better Initialization for Soft Prompts via Meta-Learning

Yukun Huang; Kun Qian; Zhou Yu

arXiv:2205.12471 · cs.CL · May 26, 2022 · 5 citations

TL;DR

MetaPT improves few-shot prompt tuning for pre-trained language models: it clusters pre-training data into auxiliary tasks and meta-learns a soft-prompt initialization over them. It is evaluated on seven downstream tasks.

Contribution

This paper introduces MetaPT (Meta-learned Prompt Tuning), which clusters pre-training data into auxiliary tasks with unsupervised methods and then meta-learns a prompt initialization across those tasks for few-shot prompt tuning.

Findings

01. MetaPT outperforms existing methods on seven downstream tasks.

02. MetaPT provides more stable and consistent performance.

03. Clustering pre-training data helps discover commonalities that improve prompt initialization.

Abstract

Prompt tuning (PT) is an effective approach to adapting pre-trained language models to downstream tasks. Without a good initialization, however, prompt tuning does not perform well in few-shot settings. Pre-trained prompt tuning (PPT) was therefore proposed to initialize prompts by leveraging pre-training data. We propose MetaPT (Meta-learned Prompt Tuning) to further improve PPT's initialization by considering latent structure within the pre-training data. Specifically, we introduce this structure by first clustering the pre-training data into different auxiliary tasks with unsupervised methods. We then use these tasks to pre-train prompts with a meta-learning algorithm. This process lets the prompts learn a better initialization by discovering commonalities among the auxiliary tasks. We evaluate our method on seven downstream tasks. Our MetaPT achieves better and more stable performance than the…
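
The two-stage recipe in the abstract (cluster the pre-training data into auxiliary tasks, then meta-learn a prompt initialization across them) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: plain k-means stands in for the unspecified unsupervised clustering method, a Reptile-style first-order update stands in for the meta-learning algorithm (which the abstract does not name), and the "soft prompt" is a small vector with a quadratic loss rather than embeddings prepended to a frozen language model. All function names (`kmeans`, `build_tasks`, `adapt`, `meta_pretrain`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def build_tasks(points, assign, k):
    """Group points by cluster index; each non-empty cluster is one auxiliary task."""
    tasks = [[p for p, a in zip(points, assign) if a == c] for c in range(k)]
    return [t for t in tasks if t]


def loss(prompt, task):
    """Toy task loss: mean of 0.5 * ||prompt - example||^2 (an assumption)."""
    return sum(0.5 * sq_dist(prompt, ex) for ex in task) / len(task)


def adapt(prompt, task, steps=5, lr=0.1):
    """Inner loop: a few gradient-descent steps on one auxiliary task."""
    adapted = list(prompt)
    for _ in range(steps):
        # Gradient of the toy loss is (adapted - mean of the task's examples).
        mean = [sum(col) / len(task) for col in zip(*task)]
        adapted = [a - lr * (a - m) for a, m in zip(adapted, mean)]
    return adapted


def meta_pretrain(tasks, dim, meta_steps=200, meta_lr=0.5, seed=0):
    """Outer loop: Reptile-style update of the shared prompt initialization."""
    rng = random.Random(seed)
    prompt = [0.0] * dim
    for _ in range(meta_steps):
        task = rng.choice(tasks)
        adapted = adapt(prompt, task)
        # Move the initialization part of the way toward the adapted prompt.
        prompt = [p + meta_lr * (a - p) for p, a in zip(prompt, adapted)]
    return prompt


# Toy "pre-training data": two blobs of 2-D feature vectors.
data_rng = random.Random(1)
points = [[data_rng.gauss(m, 0.3), data_rng.gauss(m, 0.3)] for m in (0.0, 4.0) for _ in range(20)]
tasks = build_tasks(points, kmeans(points, k=2), k=2)
init_prompt = meta_pretrain(tasks, dim=2)
print(len(tasks), [round(v, 2) for v in init_prompt])
```

In this sketch the learned initialization is shaped by every auxiliary task, so a few inner-loop steps can specialize it to any one of them. In a realistic setting the inner loop would backpropagate a language-model loss through a frozen model into the prompt embeddings, and the clustering would operate on learned text representations rather than raw 2-D points.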

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications