NSP-BERT: A Prompt-based Few-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction

Yi Sun; Yu Zheng; Chao Hao; Hangping Qiu

arXiv:2109.03564·cs.CL·October 19, 2022·30 cites

TL;DR

NSP-BERT is a sentence-level prompt-based method built on BERT's Next Sentence Prediction (NSP) task. It performs NLP tasks in the zero-shot setting without fixing the prompt length or the position to be predicted.

Contribution

The paper presents NSP-BERT, a prompt-based method that uses the NSP task for zero-shot learning and so avoids the fixed prompt length and prediction position that token-level prompt techniques require.

Findings

01. Outperforms other zero-shot methods on the FewCLUE benchmark

02. Handles tasks such as entity linking with ease

03. Comes close to few-shot performance levels in the zero-shot setting

Abstract

Using prompts to utilize language models to perform various downstream tasks, also known as prompt-based learning or prompt-learning, has lately gained significant success in comparison to the pre-train and fine-tune paradigm. Nonetheless, virtually all prompt-based methods are token-level, meaning they all utilize GPT's left-to-right language model or BERT's masked language model to perform cloze-style tasks. In this paper, we attempt to accomplish several NLP tasks in the zero-shot scenario using a BERT original pre-training task abandoned by RoBERTa and other models--Next Sentence Prediction (NSP). Unlike token-level techniques, our sentence-level prompt-based method NSP-BERT does not need to fix the length of the prompt or the position to be predicted, allowing it to handle tasks such as entity linking with ease. Based on the characteristics of NSP-BERT, we offer several quick…
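
The sentence-level idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each candidate label is turned into a prompt sentence, an NSP scorer rates how plausibly that prompt follows the input text, and the highest-scoring label is returned. The names `nsp_zero_shot`, `NspScorer`, and `toy_scorer` are hypothetical. `toy_scorer` is a word-overlap stand-in so the example runs without a model; in practice the scorer would presumably wrap the NSP head of a pre-trained BERT.

```python
from typing import Callable, Dict, Tuple

# Hypothetical scorer type: returns a score for "sentence_b follows sentence_a".
# With a real model this would be the NSP head's IsNext probability.
NspScorer = Callable[[str, str], float]


def nsp_zero_shot(
    text: str,
    label_prompts: Dict[str, str],
    scorer: NspScorer,
) -> Tuple[str, Dict[str, float]]:
    """Pick the label whose prompt sentence best 'follows' the input text.

    The prompt is a whole sentence, so its length and the position of the
    label inside it are not constrained.
    """
    scores = {label: scorer(text, prompt) for label, prompt in label_prompts.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores


def toy_scorer(sentence_a: str, sentence_b: str) -> float:
    """Stand-in for an NSP head (assumption): fraction of prompt words seen in the text."""
    words_a = set(sentence_a.lower().split())
    words_b = set(sentence_b.lower().split())
    return len(words_a & words_b) / len(words_b) if words_b else 0.0


if __name__ == "__main__":
    prompts = {
        "sports": "this is about a football match",
        "finance": "this is about the stock market",
    }
    label, scores = nsp_zero_shot("the team won the football match", prompts, toy_scorer)
    print(label, scores)
```

Passing the scorer as a parameter keeps the label-selection logic separate from the model, so the toy function can be swapped for a real NSP scorer without changing `nsp_zero_shot`.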

Code & Models

Repositories

sunyilgdx/prompts4keras (tf, Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Softmax · Attention Dropout · WordPiece · Layer Normalization · Dense Connections