Protum: A New Method For Prompt Tuning Based on "[MASK]"

Pan He; Yuxi Chen; Yan Wang; Yanru Zhang

arXiv:2201.12109·cs.CL·January 31, 2022·1 cites

Open Access

TL;DR

Protum is a prompt tuning approach that predicts labels directly from the hidden-layer information of the '[MASK]' token; it is reported to outperform fine-tuning while using less time and fewer computational resources.

Contribution

The paper proposes Protum, a prompt tuning method that directly predicts labels from '[MASK]' hidden layers, addressing token composition issues in multi-word predictions.
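The idea of predicting a label straight from the '[MASK]' hidden state can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the hidden states are hard-coded stand-ins for the output of a frozen PLM, the mask id 103 follows the BERT-style convention, and `find_mask_position`, `linear_head`, `predict_label`, and the weights are hypothetical names and values chosen for the example.

```python
# Toy sketch: classify from the hidden state at the "[MASK]" position.
# All names and numbers here are hypothetical; no real PLM is involved.

MASK_ID = 103  # assumption: BERT-style vocabulary id for "[MASK]"


def find_mask_position(token_ids, mask_id=MASK_ID):
    """Return the index of the first mask token in the sequence."""
    return token_ids.index(mask_id)


def linear_head(vector, weights, bias):
    """Compute one logit per label: logits[k] = weights[k] . vector + bias[k]."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def predict_label(token_ids, hidden_states, layer, weights, bias):
    """Return the label index with the largest logit, using only the
    mask vector taken from the chosen hidden layer."""
    pos = find_mask_position(token_ids)
    mask_vector = hidden_states[layer][pos]
    logits = linear_head(mask_vector, weights, bias)
    return max(range(len(logits)), key=logits.__getitem__)


# Stand-in for frozen encoder output: 2 layers, 4 tokens, hidden size 3.
token_ids = [101, 2023, 103, 102]  # BERT-style ids, for illustration only
hidden_states = [
    [[0.1, 0.0, 0.2], [0.3, 0.1, 0.0], [0.9, -0.2, 0.1], [0.0, 0.0, 0.1]],
    [[0.2, 0.1, 0.0], [0.1, 0.4, 0.2], [-0.5, 0.8, 0.3], [0.1, 0.0, 0.0]],
]
# Hypothetical classification head for 2 labels (the only trained part).
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]

label_from_layer0 = predict_label(token_ids, hidden_states, 0, weights, bias)
label_from_layer1 = predict_label(token_ids, hidden_states, 1, weights, bias)
```

With these toy values the two layers yield different labels, which shows in miniature why the choice of hidden layer can matter for classification.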

Findings

1. Protum outperforms traditional fine-tuning methods.

2. It achieves higher accuracy with less training time.

3. Different hidden layers impact classification performance.

Abstract

Recently, prompt tuning (Lester et al., 2021) has gradually become a new paradigm for NLP: it freezes the parameters of pre-trained language models (PLMs) and relies only on the representations of words to obtain remarkable performance on downstream tasks. It stays consistent with the Masked Language Model (MLM) task (Devlin et al., 2018) used during pre-training and avoids some issues that may happen during fine-tuning. Naturally, we consider that the "[MASK]" tokens carry more useful information than other tokens, because the model uses the context to predict the masked tokens. Current prompt tuning methods suffer from a serious problem of random composition of the answer tokens when they predict multiple words, so they have to map tokens to labels with the help of a verbalizer. In response to the above issue, we propose a new…
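The composition problem mentioned in the abstract can be made concrete with a toy example. The sketch below assumes two mask slots whose token probabilities are produced independently; the label phrases, the probabilities, and the helper names `independent_argmax` and `verbalizer_predict` are invented for illustration, and the scoring rule is only one simple way a verbalizer might map tokens to labels.

```python
# Toy illustration of the multi-"[MASK]" composition problem.
# Probabilities and label words are invented for illustration.

label_words = {
    "positive": ("very", "good"),
    "negative": ("quite", "bad"),
}

# Per-position token probabilities, as an MLM head would produce them
# independently for two mask slots.
position_probs = [
    {"very": 0.55, "quite": 0.45},
    {"good": 0.40, "bad": 0.60},
]


def independent_argmax(probs):
    """Pick the most likely token at each mask position on its own."""
    return tuple(max(p, key=p.get) for p in probs)


def verbalizer_predict(probs, words):
    """Score each label by the product of its tokens' probabilities,
    so only valid label phrases can be returned."""
    def score(tokens):
        result = 1.0
        for p, tok in zip(probs, tokens):
            result *= p[tok]
        return result
    return max(words, key=lambda label: score(words[label]))


composed = independent_argmax(position_probs)       # tokens chosen per slot
is_valid_label = composed in label_words.values()   # does it match any label?
mapped = verbalizer_predict(position_probs, label_words)
```

Here the per-slot choices combine into a phrase that belongs to neither label, so an explicit mapping step is needed to recover a valid label. Predicting from the '[MASK]' hidden state directly avoids that step.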

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications