Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner
Zhengxiang Shi, Aldo Lipani

TL;DR
This paper introduces Prompt-based Continued Pre-training (PCP), a method that enhances prompt-based fine-tuning of language models by combining instruction tuning with conventional continued pre-training, leading to significant performance improvements across various NLP tasks.
Contribution
The paper proposes PCP, a continued pre-training approach that improves prompt-based fine-tuning performance and outperforms state-of-the-art semi-supervised methods without requiring iterative processes or additional data augmentation.
Findings
PCP improves prompt-based fine-tuning performance by up to 20.1%.
Conventional continued pre-training does not always benefit downstream tasks.
PCP outperforms state-of-the-art semi-supervised methods with less complexity.
Abstract
Language models (LMs) trained on vast quantities of unlabelled data have greatly advanced the field of natural language processing (NLP). In this study, we re-visit the widely accepted notion in NLP that continued pre-training LMs on task-related texts improves the performance of fine-tuning (FT) in downstream tasks. Through experiments on eight single-sentence tasks and eight sentence-pair tasks in both semi-supervised and fully-supervised settings, we find that conventional continued pre-training does not consistently provide benefits and can even be detrimental for sentence-pair tasks or when prompt-based FT is used. To tackle these issues, we propose Prompt-based Continued Pre-training (PCP), which combines the idea of instruction tuning with conventional continued pre-training. Our approach aims to improve the performance of prompt-based FT by presenting both task-related texts and…
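The abstract is cut off before it describes the mechanics, so the following is only a minimal sketch of one way to read "combining instruction tuning with conventional continued pre-training": wrap each task-related text in a prompt template, then apply a masked-token objective to the wrapped text. The template string, the label word, the whitespace tokenization, the `<mask>` token, and the 15% masking rate are all illustrative assumptions, not details taken from the paper.

```python
import random

MASK = "<mask>"  # assumed mask token; real tokenizers define their own


def wrap_with_template(text, template, label_word):
    """Place a task text and a label word into a prompt template.

    The template and label word are hypothetical placeholders.
    """
    return template.format(text=text, label=label_word)


def mask_tokens(tokens, mask_prob, rng):
    """Randomly replace tokens with MASK, as in masked language modelling.

    Returns the masked sequence and, per position, the original token
    if it was masked (the prediction target) or None otherwise.
    """
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append(tok)
        else:
            masked.append(tok)
            targets.append(None)
    return masked, targets


def build_example(text, template, label_word, mask_prob=0.15, seed=0):
    """Build one continued pre-training example from a templated text."""
    rng = random.Random(seed)
    # Whitespace splitting stands in for a real subword tokenizer.
    tokens = wrap_with_template(text, template, label_word).split()
    return mask_tokens(tokens, mask_prob, rng)


if __name__ == "__main__":
    masked, targets = build_example(
        "the film was a delight",
        "{text} It was {label} .",  # hypothetical sentiment template
        "great",                    # hypothetical label word
    )
    print(masked)
    print(targets)
```

Because the template is part of the text being masked, the model sees the same prompt format during continued pre-training that it would later see during prompt-based fine-tuning. That is the intuition this sketch tries to capture; the paper itself should be consulted for the actual procedure.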
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
