ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

Yuanmeng Yan; Rumei Li; Sirui Wang; Fuzheng Zhang; Wei Wu; Weiran Xu

arXiv:2105.11741·cs.CL·May 26, 2021·47 cites

TL;DR

ConSERT introduces a contrastive learning framework to fine-tune BERT for better unsupervised sentence representations, significantly improving performance on semantic textual similarity tasks and demonstrating robustness with limited data.

Contribution

The paper presents ConSERT, a novel contrastive learning approach that enhances BERT-derived sentence representations without supervision, addressing the collapse issue and improving downstream task performance.

Findings

1. Achieves an 8% relative improvement over the previous state of the art on STS datasets.

2. Attains new state-of-the-art performance when NLI supervision is incorporated.

3. Performs well with only 1000 training samples, showing robustness in data-scarce scenarios.

Abstract

Learning high-quality sentence representations benefits a wide range of natural language processing tasks. Although BERT-based pre-trained language models achieve high performance on many downstream tasks, the sentence representations derived natively from them have been shown to collapse and thus perform poorly on semantic textual similarity (STS) tasks. In this paper, we present ConSERT, a Contrastive Framework for Self-Supervised Sentence Representation Transfer, which adopts contrastive learning to fine-tune BERT in an unsupervised and effective way. By making use of unlabeled texts, ConSERT solves the collapse issue of BERT-derived sentence representations and makes them more applicable to downstream tasks. Experiments on STS datasets demonstrate that ConSERT achieves an 8% relative improvement over the previous state of the art, even comparable to the supervised SBERT-NLI. And…
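To make the contrastive idea in the abstract concrete, here is a minimal sketch of a normalized temperature-scaled cross-entropy (NT-Xent) loss over two augmented views of each sentence. The abstract does not name the loss, so NT-Xent, the temperature of 0.1, and the function names `cosine` and `nt_xent_loss` are all assumptions chosen for illustration. This is not the authors' implementation, and the plain Python lists stand in for BERT sentence embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(views_a, views_b, temperature=0.1):
    """Mean NT-Xent loss over 2N views (illustrative sketch).

    views_a[i] and views_b[i] are two views of the same sentence and form
    the positive pair; every other view in the batch acts as a negative.
    The temperature value is an assumption, not taken from the source.
    """
    reps = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same sentence
        logits = [
            cosine(reps[i], reps[k]) / temperature
            for k in range(2 * n)
            if k != i  # a view is never compared with itself
        ]
        pos_logit = cosine(reps[i], reps[pos]) / temperature
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)


# Toy "embeddings": two sentences, two views each.
views_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]   # each view matches its partner
swapped_b = [[0.1, 0.9], [0.9, 0.1]]   # each view matches the wrong sentence

print(nt_xent_loss(views_a, aligned_b))
print(nt_xent_loss(views_a, swapped_b))
```

The loss is small when the two views of a sentence are close to each other and far from the other sentences in the batch, and large otherwise. Minimizing it therefore spreads different sentences apart, which is the intuition behind using contrastive learning to counter collapsed representations.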

Code & Models

Repositories

yym6472/ConSERT
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Linear Layer · Contrastive Learning · WordPiece · Softmax · Layer Normalization · Dropout · Attention Dropout · Residual Connection