Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner

Zhengxiang Shi; Aldo Lipani

arXiv:2305.01711 · cs.CL · October 9, 2023 · 6 cites

TL;DR

This paper introduces Prompt-based Continued Pre-training (PCP), which combines the idea of instruction tuning with conventional continued pre-training to improve prompt-based fine-tuning of language models, with reported gains of up to 20.1%.

Contribution

The paper proposes PCP, a continued pre-training approach that improves prompt-based fine-tuning and outperforms state-of-the-art semi-supervised methods without requiring iterative processes or additional data augmentation.

Findings

01 PCP improves prompt-based fine-tuning performance by up to 20.1% (absolute).

02 Conventional continued pre-training does not consistently benefit downstream tasks and can even be detrimental for sentence-pair tasks or when prompt-based fine-tuning is used.

03 PCP outperforms state-of-the-art semi-supervised methods with less complexity.

Abstract

Language models (LMs) trained on vast quantities of unlabelled data have greatly advanced the field of natural language processing (NLP). In this study, we re-visit the widely accepted notion in NLP that continued pre-training LMs on task-related texts improves the performance of fine-tuning (FT) in downstream tasks. Through experiments on eight single-sentence tasks and eight sentence-pair tasks in both semi-supervised and fully-supervised settings, we find that conventional continued pre-training does not consistently provide benefits and can even be detrimental for sentence-pair tasks or when prompt-based FT is used. To tackle these issues, we propose Prompt-based Continued Pre-training (PCP), which combines the idea of instruction tuning with conventional continued pre-training. Our approach aims to improve the performance of prompt-based FT by presenting both task-related texts and…
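
The abstract above is cut off, so the exact PCP recipe is not stated on this page. As a rough illustration of what "combining the idea of instruction tuning with conventional continued pre-training" could look like, the sketch below wraps a task-related text in a cloze-style prompt template and then applies random token masking, as in masked language modelling. The template string, the label word, the whitespace tokenizer, the masking rate, and all function names are hypothetical choices made for this example; this is not the authors' implementation.

```python
import random

MASK = "<mask>"  # placeholder mask token (assumption; real tokenizers define their own)


def wrap_with_template(text, label_word, template="{text} It was {label}."):
    """Wrap a task text in a cloze-style prompt (the template is hypothetical)."""
    return template.format(text=text, label=label_word)


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Randomly replace tokens with MASK.

    Returns (inputs, targets), where targets holds the original token at
    masked positions and None elsewhere. This is a simplification: it
    always substitutes MASK and does not mix in random or unchanged tokens.
    """
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
            targets.append(None)
    return inputs, targets


def build_pretraining_example(text, label_word, mask_prob=0.15, seed=0):
    """Build one masked example from a templated text (whitespace tokens)."""
    prompt = wrap_with_template(text, label_word)
    return mask_tokens(prompt.split(), mask_prob=mask_prob, seed=seed)


if __name__ == "__main__":
    inputs, targets = build_pretraining_example(
        "A gripping film.", "great", mask_prob=0.5, seed=1
    )
    print(inputs)
    print(targets)
```

Where the label word comes from is left outside this sketch. The only point illustrated is that the model sees the text together with the prompt template during continued pre-training, rather than the raw text alone.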

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification