TL;DR
This paper introduces Mirror-BERT, a simple, fast, self-supervised method that converts pretrained Masked Language Models (MLMs) into effective universal lexical and sentence encoders without additional data or supervision, with results competitive with task-tuned models.
Contribution
The paper presents Mirror-BERT, a contrastive learning technique that transforms MLMs such as BERT and RoBERTa into universal encoders in 20-30 seconds, without external knowledge or task-specific fine-tuning.
Findings
Mirror-BERT substantially improves the performance of off-the-shelf MLMs on lexical and sentence similarity tasks.
Despite using no annotated data, it matches the performance of task-tuned Sentence-BERT models.
The method works across different domains and languages.
Abstract
Pretrained Masked Language Models (MLMs) have revolutionised NLP in recent years. However, previous work has indicated that off-the-shelf MLMs are not effective as universal lexical or sentence encoders without further task-specific fine-tuning on NLI, sentence similarity, or paraphrasing tasks using annotated task data. In this work, we demonstrate that it is possible to turn MLMs into effective universal lexical and sentence encoders even without any additional data and without any supervision. We propose an extremely simple, fast and effective contrastive learning technique, termed Mirror-BERT, which converts MLMs (e.g., BERT and RoBERTa) into such encoders in 20-30 seconds without any additional external knowledge. Mirror-BERT relies on fully identical or slightly modified string pairs as positive (i.e., synonymous) fine-tuning examples, and aims to maximise their similarity during…
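The contrastive objective described in the abstract can be illustrated with a minimal, dependency-free sketch. It assumes an InfoNCE-style loss in which two encoded views of the same string form the positive pair and the other strings in the batch act as negatives. The function names, the toy two-dimensional embeddings, and the temperature value are illustrative assumptions, not the authors' implementation; in practice the embeddings would come from the MLM being fine-tuned.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(views_a, views_b, temperature=0.05):
    """Mean InfoNCE loss over a batch (simplified sketch).

    views_a[i] and views_b[i] are assumed to be two encodings of the same
    string (the positive pair); views_b[j] for j != i serve as negatives.
    The temperature is an illustrative value, not taken from the paper.
    """
    n = len(views_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(views_a[i], views_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings (hypothetical): each row in aligned_b is a slightly
# perturbed copy of the matching row in aligned_a.
aligned_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]
shuffled_b = [aligned_b[1], aligned_b[0]]  # positives deliberately mismatched

loss_aligned = info_nce_loss(aligned_a, aligned_b)
loss_shuffled = info_nce_loss(aligned_a, shuffled_b)
print(loss_aligned < loss_shuffled)
```

The loss is low when each string's two views are closer to each other than to the other strings in the batch, which is the behaviour that maximising positive-pair similarity is meant to encourage.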
Code & Models
- 🤗 cambridgeltl/mirror-bert-base-uncased-sentence-drophead (model) · 12 dl
- 🤗 cambridgeltl/mirror-bert-base-uncased-sentence (model) · 8 dl
- 🤗 cambridgeltl/mirror-bert-base-uncased-word (model) · 3 dl · ♡ 1
- 🤗 cambridgeltl/mirror-roberta-base-sentence-drophead (model) · 15 dl · ♡ 1
- 🤗 cambridgeltl/mirror-roberta-base-sentence (model) · 5 dl
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Contrastive Learning · Mirror-BERT · Dropout · Adam · Dense Connections · Softmax
