FPT: Improving Prompt Tuning Efficiency via Progressive Training

Yufei Huang; Yujia Qin; Huadong Wang; Yichun Yin; Maosong Sun; Zhiyuan Liu; Qun Liu

arXiv:2211.06840·cs.CL·November 15, 2022·1 cites

TL;DR

This paper introduces Fast Prompt Tuning (FPT), a method that enhances prompt tuning efficiency by progressively expanding partial pre-trained language models and reusing learned prompts, reducing training costs while maintaining performance.

Contribution

The paper proposes a novel progressive training approach for prompt tuning that leverages transferability of soft prompts across partial models, significantly improving training efficiency.

Findings

01 FPT reduces training computation by more than 30%.

02 FPT achieves performance comparable to standard prompt tuning.

03 FPT is effective across multiple tasks.

Abstract

Recently, prompt tuning (PT) has gained increasing attention as a parameter-efficient way of tuning pre-trained language models (PLMs). Despite extensively reducing the number of tunable parameters and achieving satisfying performance, PT is training-inefficient due to its slow convergence. To improve PT's training efficiency, we first make some novel observations about the prompt transferability of "partial PLMs", which are defined by compressing a PLM in depth or width. We observe that the soft prompts learned by different partial PLMs of various sizes are similar in the parameter space, implying that these soft prompts could potentially be transferred among partial PLMs. Inspired by these observations, we propose Fast Prompt Tuning (FPT), which starts by conducting PT using a small-scale partial PLM, and then progressively expands its depth and width until the full-model size. After…
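The progressive schedule described above can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a stand-in "PLM" made of frozen near-identity linear layers, a partial model formed by keeping only the first `depth` layers (the paper's actual compression scheme may differ), and a simple squared-error objective. Width expansion is omitted. All names (`make_frozen_layers`, `tune_prompt`, `progressive_prompt_tuning`, the schedule values) are hypothetical.

```python
import numpy as np

def make_frozen_layers(num_layers, dim, seed=0):
    """Toy stand-in for a frozen PLM: fixed near-identity linear layers."""
    rng = np.random.default_rng(seed)
    return [np.eye(dim) + 0.01 * rng.standard_normal((dim, dim))
            for _ in range(num_layers)]

def compose(layers, dim):
    """Collapse a stack of linear layers into a single matrix."""
    m = np.eye(dim)
    for w in layers:
        m = m @ w
    return m

def loss(prompt, layers, target):
    """Squared error between the model output on the prompt and the target."""
    m = compose(layers, prompt.shape[-1])
    return 0.5 * float(np.sum((prompt @ m - target) ** 2))

def tune_prompt(prompt, layers, target, steps, lr=0.1):
    """Gradient descent on the soft prompt only; the layers stay frozen."""
    m = compose(layers, prompt.shape[-1])
    for _ in range(steps):
        residual = prompt @ m - target
        prompt = prompt - lr * residual @ m.T  # gradient of the squared error
    return prompt

def progressive_prompt_tuning(layers, target, schedule, prompt):
    """Tune on growing partial models, reusing the prompt from each stage.

    schedule: list of (depth, steps) pairs ending at the full depth.
    ASSUMPTION: a partial model is simply the first `depth` layers.
    """
    for depth, steps in schedule:
        partial = layers[:depth]
        prompt = tune_prompt(prompt, partial, target, steps)
    return prompt

dim = 8
layers = make_frozen_layers(num_layers=12, dim=dim)
target = np.ones((4, dim))        # toy training signal
init_prompt = np.zeros((4, dim))  # 4 soft-prompt tokens
schedule = [(4, 50), (8, 50), (12, 100)]  # hypothetical stage sizes

final_prompt = progressive_prompt_tuning(layers, target, schedule, init_prompt)
init_loss = loss(init_prompt, layers, target)
final_loss = loss(final_prompt, layers, target)
print(init_loss, final_loss)
```

In this sketch the early stages run on shallower stacks, so each of their steps touches fewer layers, and the prompt learned there is the starting point for the next, larger stage. Whether that warm start helps on a real PLM depends on the prompt transferability the abstract reports; the toy model does not test that claim.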

Code & Models

Repositories

thunlp/fastprompttuning (jax, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis