Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding
Shiyang Li, Semih Yavuz, Wenhu Chen, Xifeng Yan

TL;DR
This paper shows that task-adaptive pre-training (TAPT) and self-training (ST) are complementary semi-supervised methods for natural language understanding, and that combining them via the TFS protocol (TAPT -> Finetuning -> Self-training) yields consistent gains across six datasets.
Contribution
The study introduces the TFS protocol, combining TAPT and self-training, showing their additive benefits and establishing a strong semi-supervised baseline for NLP tasks.
Findings
TFS consistently improves performance across six NLP datasets.
TAPT and ST are complementary and can be effectively combined.
Gains from TAPT and ST are strongly additive across the semi-supervised settings investigated.
Abstract
Task-adaptive pre-training (TAPT) and self-training (ST) have emerged as the major semi-supervised approaches to improving natural language understanding (NLU) tasks with massive amounts of unlabeled data. However, it is unclear whether they learn similar representations or whether they can be effectively combined. In this paper, we show that TAPT and ST can be complementary under a simple TFS protocol that follows a TAPT -> Finetuning -> Self-training (TFS) process. Experimental results show that the TFS protocol can effectively utilize unlabeled data to achieve strong combined gains consistently across six datasets covering sentiment classification, paraphrase identification, natural language inference, named entity recognition and dialogue slot classification. We investigate various semi-supervised settings and consistently show that gains from TAPT and ST can be strongly additive by following TFS…
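The ordering of the three TFS stages can be sketched as a small, framework-agnostic function. This is an illustrative sketch only, not the authors' implementation. The names `tfs`, `tapt`, `finetune`, `predict`, and `rounds` are hypothetical. The choices to use hard pseudo-labels and to restart each student from the TAPT-adapted model are assumptions that the abstract does not confirm.

```python
from typing import Any, Callable, List, Sequence, Tuple


def tfs(
    base_model: Any,
    labeled: Sequence[Tuple[Any, Any]],
    unlabeled: Sequence[Any],
    tapt: Callable[[Any, Sequence[Any]], Any],
    finetune: Callable[[Any, Sequence[Tuple[Any, Any]]], Any],
    predict: Callable[[Any, Any], Any],
    rounds: int = 1,
) -> Any:
    """Run TAPT -> Finetuning -> Self-training with caller-supplied stages.

    All stage callables are hypothetical placeholders:
      tapt(model, unlabeled)    -> model adapted on unlabeled task text
      finetune(model, examples) -> model trained on (input, label) pairs
      predict(model, x)         -> label predicted for input x
    """
    # Stage 1 (TAPT): continue pre-training on the unlabeled task data.
    adapted = tapt(base_model, unlabeled)

    # Stage 2 (Finetuning): train a teacher on the labeled data.
    teacher = finetune(adapted, labeled)

    # Stage 3 (Self-training): pseudo-label the unlabeled data with the
    # teacher, then train a student on labeled + pseudo-labeled examples.
    # Assumption: the student restarts from the TAPT-adapted model.
    for _ in range(rounds):
        pseudo: List[Tuple[Any, Any]] = [(x, predict(teacher, x)) for x in unlabeled]
        teacher = finetune(adapted, list(labeled) + pseudo)

    return teacher
```

In practice, `tapt` might wrap continued masked-language-model training and `finetune` a standard supervised training loop. Passing the stages in as callables keeps the sketch independent of any particular library.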
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
