A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned and Perspectives

Nils Rethmeier; Isabelle Augenstein

arXiv:2102.12982·cs.CL·February 26, 2021

TL;DR

This paper surveys contrastive pretraining methods in NLP, covering their applications, challenges, and lessons learned, with the aim of carrying over what made contrastive pretraining successful for image representations.

Contribution

It provides a comprehensive survey of contrastive NLP pretraining methods, highlighting key concepts, applications, and future research directions.

Findings

1. Contrastive NLP methods improve language modeling and task performance.

2. Automated text augmentation remains a major challenge in NLP contrastive learning.

3. Lessons from image pretraining can inform future NLP contrastive approaches.

Abstract

Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to boost the performance of various application tasks. These pretraining methods are frequently extended with recurrence, adversarial or linguistic property masking, and more recently with contrastive learning objectives. Contrastive self-supervised training objectives enabled recent successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar. However, in NLP, automated creation of text input augmentations is still very challenging because a single token can invert the meaning of a sentence. For this reason, some contrastive NLP pretraining methods contrast over input-label pairs, rather than over input-input pairs, using methods from Metric Learning and Energy Based Models. In this…
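To make the input-input setting described in the abstract concrete, the following is a minimal sketch of an InfoNCE-style contrastive loss in NumPy. It is an illustration under stated assumptions, not the formulation of any specific method covered by the survey: the function names, the temperature value, and the use of random vectors with added noise as stand-ins for text embeddings and their augmentations are all hypothetical. Each row of `queries` is treated as similar to the same row of `keys` and dissimilar to every other row in the batch.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def info_nce(queries, keys, temperature=0.1):
    """InfoNCE-style loss: queries[i] should match keys[i].

    All other keys[j] (j != i) in the batch act as negatives.
    The temperature of 0.1 is an illustrative choice, not a recommended value.
    """
    q = l2_normalize(queries)
    k = l2_normalize(keys)
    logits = q @ k.T / temperature                       # (n, n) scaled similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))           # mean loss over positives


# Hypothetical stand-ins for encoder outputs: 4 "texts" with 8-dim embeddings.
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
augmented = text + 0.01 * rng.normal(size=(4, 8))  # slightly perturbed views
shuffled = augmented[::-1].copy()                  # deliberately wrong pairing

loss_aligned = info_nce(text, augmented)
loss_shuffled = info_nce(text, shuffled)
print(loss_aligned < loss_shuffled)
```

The loss is lower when each input is paired with its own perturbed view than when the pairing is scrambled. For the input-label setting mentioned in the abstract, one could pass label embeddings as `keys` instead of augmented inputs; this is again an assumption for illustration, and a batch that repeats a label would need extra handling so that identical labels are not treated as negatives.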

Code & Models

Datasets

BAAI/SurveyScope

Taxonomy

Methods: Contrastive Learning