Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

Beliz Gunel; Jingfei Du; Alexis Conneau; Ves Stoyanov

arXiv:2011.01403 · cs.CL · April 6, 2021 · 60 citations

TL;DR

This paper introduces a supervised contrastive learning objective for fine-tuning pre-trained language models, improving generalization, robustness, and performance in few-shot settings without extra data or architecture changes.

Contribution

The paper proposes a supervised contrastive loss for the fine-tuning stage which, combined with cross-entropy, improves performance and robustness over fine-tuning with cross-entropy alone on NLP classification tasks.

Findings

01. Significant improvements on the GLUE benchmark in few-shot learning settings.

02. Enhanced robustness to noise in the training data.

03. Better generalization to related tasks with limited labeled data.

Abstract

State-of-the-art natural language understanding classification models follow two stages: pre-training a large language model on an auxiliary task, and then fine-tuning the model on a task-specific labeled dataset using cross-entropy loss. However, the cross-entropy loss has several shortcomings that can lead to sub-optimal generalization and instability. Driven by the intuition that good generalization requires capturing the similarity between examples in one class and contrasting them with examples in other classes, we propose a supervised contrastive learning (SCL) objective for the fine-tuning stage. Combined with cross-entropy, our proposed SCL loss obtains significant improvements over a strong RoBERTa-Large baseline on multiple datasets of the GLUE benchmark in few-shot learning settings, without requiring specialized architecture, data augmentations, memory banks, or additional…
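
The abstract describes the objective only in words, so the following is a minimal sketch of how a supervised contrastive term might be combined with cross-entropy, not the authors' implementation. It assumes the commonly used supervised contrastive formulation (L2-normalized embeddings, temperature-scaled dot products, same-label examples in the batch as positives) and a convex combination weighted by a hypothetical `lam`. The function names, the default values of `lam` and `tau`, and the averaging over anchors are illustrative assumptions that this page does not specify. Plain Python is used for readability; the linked repository uses PyTorch.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (an all-zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scl_loss(embeddings, labels, tau=0.3):
    """Supervised contrastive loss for one batch.

    For each anchor i, every other example with the same label is a positive;
    all other examples in the batch form the softmax denominator.
    Anchors with no positive in the batch are skipped (an assumption).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        sims = [dot(z[i], z[k]) / tau for k in range(n) if k != i]
        m = max(sims)  # log-sum-exp shift for numerical stability
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        for j in positives:
            total += -(dot(z[i], z[j]) / tau - log_denom) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def cross_entropy(logits, labels):
    """Mean cross-entropy of integer class labels under softmax(logits)."""
    total = 0.0
    for row, y in zip(logits, labels):
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[y]
    return total / len(labels)


def combined_loss(logits, embeddings, labels, lam=0.9, tau=0.3):
    """Weighted sum of the two terms; lam and tau are hypothetical defaults."""
    return (1 - lam) * cross_entropy(logits, labels) + lam * scl_loss(embeddings, labels, tau)
```

In this sketch, a batch whose same-class embeddings point in similar directions yields a lower `scl_loss` than the same embeddings paired with labels that cut across the clusters, which is the intuition the abstract states. Setting `lam=0` reduces `combined_loss` to plain cross-entropy fine-tuning.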

Code & Models

Repositories

sl-93/SUPERVISED-CONTRASTIVE-LEARNING-FOR-PRE-TRAINED-LANGUAGE-MODEL-FINE-TUNING (PyTorch)

Videos

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning · SlidesLive

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Contrastive Learning