CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models

Yusheng Su; Xu Han; Yankai Lin; Zhengyan Zhang; Zhiyuan Liu; Peng Li; Jie Zhou; Maosong Sun

arXiv:2102.03752·cs.CL·November 15, 2021

TL;DR

CSS-LM introduces a contrastive semi-supervised framework that enhances the fine-tuning of pre-trained language models in low-resource scenarios by leveraging unlabeled data to better capture task-specific semantic features.

Contribution

The paper presents a contrastive semi-supervised learning framework for fine-tuning PLMs that improves performance on low-resource NLP tasks over conventional fine-tuning.

Findings

1. CSS-LM outperforms conventional fine-tuning in few-shot settings.

2. It surpasses recent supervised contrastive fine-tuning strategies.

3. It captures task-related semantic features for downstream tasks more effectively.

Abstract

Fine-tuning pre-trained language models (PLMs) has demonstrated its effectiveness on various downstream NLP tasks recently. However, in many low-resource scenarios, the conventional fine-tuning strategies cannot sufficiently capture the important semantic features for downstream tasks. To address this issue, we introduce a novel framework (named "CSS-LM") to improve the fine-tuning phase of PLMs via contrastive semi-supervised learning. Specifically, given a specific task, we retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to the task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features. The experimental results show that CSS-LM achieves better results than the…
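The retrieve-then-contrast idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D embeddings, the cosine-similarity ranking, the `retrieve` and `contrastive_loss` helpers, the top-k cutoff, and the InfoNCE-style objective with a `temperature` parameter are all assumptions chosen for illustration. In the actual framework the embeddings would come from the PLM, and relatedness is measured at both the domain level and the class level.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(anchor, pool, k):
    """Rank unlabeled embeddings by similarity to the anchor.

    Hypothetical rule: the k most similar instances are treated as
    positives and all remaining instances as negatives (requires k >= 1).
    """
    ranked = sorted(range(len(pool)),
                    key=lambda i: cosine(anchor, pool[i]),
                    reverse=True)
    return ranked[:k], ranked[k:]

def contrastive_loss(anchor, pool, pos_idx, neg_idx, temperature=0.1):
    """InfoNCE-style loss averaged over the positives.

    For each positive p: -log( exp(s_p / t) / sum_j exp(s_j / t) ),
    where j ranges over all positives and negatives.
    """
    sims = {i: cosine(anchor, pool[i]) / temperature
            for i in pos_idx + neg_idx}
    m = max(sims.values())  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
    return sum(log_denom - sims[i] for i in pos_idx) / len(pos_idx)

# Toy data: one labeled anchor and four unlabeled instances (assumed values).
anchor = [1.0, 0.0]
pool = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.3], [-1.0, 0.0]]

pos, neg = retrieve(anchor, pool, k=2)
loss = contrastive_loss(anchor, pool, pos, neg)
# Swapping the roles of positives and negatives should give a larger loss.
swapped_loss = contrastive_loss(anchor, pool, neg, pos)
```

In this toy setup the two instances closest to the anchor are selected as positives, and the loss is lower than when the positive and negative sets are swapped, which is the behavior a contrastive objective is meant to reward.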

Code & Models

Repositories

thunlp/CSS-LM (PyTorch, official)
