ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
Yuanmeng Yan, Rumei Li, Sirui Wang, Fuzheng Zhang, Wei Wu, Weiran Xu

TL;DR
ConSERT introduces a contrastive learning framework to fine-tune BERT for better unsupervised sentence representations, significantly improving performance on semantic textual similarity tasks and demonstrating robustness with limited data.
Contribution
The paper presents ConSERT, a novel contrastive learning approach that enhances BERT-derived sentence representations without supervision, addressing the collapse issue and improving downstream task performance.
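To make the idea of a contrastive objective concrete, here is a minimal sketch of one common choice, the NT-Xent loss, written in plain Python. This summary does not state which loss or hyperparameters the paper uses, so treat it as an illustration of the general technique, not the authors' implementation. The function names, the toy 2-d vectors, and the temperature of 0.1 are all assumptions made for the example; in practice the vectors would be sentence embeddings produced by BERT for two views of each sentence.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(views_a, views_b, temperature=0.1):
    """Mean NT-Xent loss over a batch (illustrative sketch).

    views_a[i] and views_b[i] are two representations of sentence i
    (the positive pair); every other vector in the batch is a negative.
    The temperature value is an assumption chosen for illustration.
    """
    reps = views_a + views_b
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same sentence
        # Scaled similarities to every other vector, positive included.
        logits = [cosine(reps[i], reps[j]) / temperature
                  for j in range(2 * n) if j != i]
        pos_logit = cosine(reps[i], reps[pos]) / temperature
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


# Toy usage: matching views should give a lower loss than mismatched ones.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = nt_xent_loss(anchors, [[0.9, 0.1], [0.1, 0.9]])
swapped = nt_xent_loss(anchors, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < swapped)
```

Minimizing a loss of this kind pulls the two views of a sentence together and pushes different sentences apart, which is the mechanism a contrastive approach relies on to spread out representations that would otherwise collapse into a narrow region.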
Findings
Achieves an 8% relative improvement over the previous state of the art on STS datasets.
Attains new state-of-the-art performance when incorporating NLI supervision.
Performs well with only 1000 samples, showing robustness in data-scarce scenarios.
Abstract
Learning high-quality sentence representations benefits a wide range of natural language processing tasks. Though BERT-based pre-trained language models achieve high performance on many downstream tasks, the natively derived sentence representations have been shown to collapse and thus perform poorly on semantic textual similarity (STS) tasks. In this paper, we present ConSERT, a Contrastive Framework for Self-Supervised Sentence Representation Transfer, that adopts contrastive learning to fine-tune BERT in an unsupervised and effective way. By making use of unlabeled texts, ConSERT solves the collapse issue of BERT-derived sentence representations and makes them more applicable to downstream tasks. Experiments on STS datasets demonstrate that ConSERT achieves an 8% relative improvement over the previous state of the art, even comparable to the supervised SBERT-NLI. And…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Contrastive Learning · WordPiece · Softmax · Layer Normalization · Dropout · Attention Dropout · Residual Connection
