Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach

Yue Yu; Simiao Zuo; Haoming Jiang; Wendi Ren; Tuo Zhao; Chao Zhang

arXiv:2010.07835 · cs.CL · April 1, 2021 · 5 cites

TL;DR

This paper introduces COSINE, a contrastive self-training framework for fine-tuning pre-trained language models using only weak supervision. It reduces overfitting to noisy weak labels and improves performance across multiple NLP tasks.

Contribution

The paper proposes a contrastive self-training approach that combines contrastive regularization with confidence-based reweighting to fine-tune language models without labeled data. It outperforms strong baselines on 7 benchmarks across 6 tasks.
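To make the idea of a contrastive regularizer concrete, here is a minimal sketch of a generic pairwise contrastive loss over pseudo-labeled samples. It is an illustration under stated assumptions, not the paper's exact formulation: the function name `contrastive_regularizer`, the `margin` parameter, and the use of Euclidean distance are all hypothetical choices made for this example.

```python
import math


def contrastive_regularizer(embeddings, pseudo_labels, margin=1.0):
    """Generic pairwise contrastive loss (assumed form, not taken from the paper).

    Pairs that share a pseudo-label are pulled together (squared distance);
    pairs with different pseudo-labels are pushed apart until they are at
    least `margin` away from each other.
    """
    total, pairs = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(embeddings[i], embeddings[j])
            if pseudo_labels[i] == pseudo_labels[j]:
                total += dist ** 2
            else:
                total += max(0.0, margin - dist) ** 2
            pairs += 1
    # Average over all pairs; an empty or single-sample batch contributes nothing.
    return total / pairs if pairs else 0.0
```

In a training loop, a term of this kind would typically be added to the classification loss with a small coefficient, so that samples the model assigns to the same class end up close in representation space.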

Findings

01. Outperforms strong baselines on 7 benchmarks across 6 tasks
02. Achieves performance comparable to fully-supervised fine-tuning
03. Effectively suppresses error propagation during training

Abstract

Fine-tuned pre-trained language models (LMs) have achieved enormous success in many natural language processing (NLP) tasks, but they still require excessive labeled data in the fine-tuning stage. We study the problem of fine-tuning pre-trained LMs using only weak supervision, without any labeled data. This problem is challenging because the high capacity of LMs makes them prone to overfitting the noisy labels generated by weak supervision. To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision. Underpinned by contrastive regularization and confidence-based reweighting, this contrastive self-training framework can gradually improve model fitting while effectively suppressing error propagation. Experiments on sequence, token, and sentence pair classification tasks show that our model outperforms the strongest…
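Confidence-based reweighting, as named in the abstract, can be sketched roughly as follows. This is a hedged illustration only: it assumes confidence is measured as one minus the normalized entropy of the predicted distribution, and that samples below a cutoff are dropped. The function names, the `threshold` parameter, and its default value are hypothetical and are not confirmed by the text above.

```python
import math


def confidence_weights(probs, threshold=0.5):
    """Weight each sample by 1 - normalized entropy (an assumed confidence measure).

    Samples whose confidence falls below `threshold` get weight 0, so
    uncertain pseudo-labels do not contribute to the loss.
    """
    weights = []
    for p in probs:
        entropy = -sum(q * math.log(q) for q in p if q > 0.0)
        confidence = 1.0 - entropy / math.log(len(p))
        weights.append(confidence if confidence >= threshold else 0.0)
    return weights


def reweighted_cross_entropy(probs, pseudo_labels, weights):
    """Weighted average cross-entropy against (possibly noisy) pseudo-labels."""
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # no sample was confident enough to train on
    loss = 0.0
    for p, y, w in zip(probs, pseudo_labels, weights):
        loss += -w * math.log(max(p[y], 1e-12))
    return loss / total_weight


# Usage: the confident first sample drives the loss; the uniform second one is ignored.
probs = [[0.98, 0.02], [0.5, 0.5]]
weights = confidence_weights(probs)
loss = reweighted_cross_entropy(probs, [0, 1], weights)
```

Down-weighting uncertain predictions in this way is one plausible mechanism for limiting error propagation in self-training: a wrong pseudo-label that the model is unsure about has little or no influence on the next update.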

Code & Models

Repositories

yueyu1030/COSINE (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications