Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers, Iryna Gurevych

TL;DR
Sentence-BERT (SBERT) modifies BERT with siamese and triplet networks to generate efficient, high-quality sentence embeddings suitable for semantic similarity tasks, drastically reducing computation time while maintaining accuracy.
Contribution
The paper introduces SBERT, a novel BERT-based architecture that produces semantically meaningful sentence embeddings suitable for fast similarity search.
Findings
SBERT reduces the time needed to find the most similar pair in a collection of 10,000 sentences from about 65 hours with BERT to about 5 seconds.
SBERT outperforms existing sentence embedding methods on STS and transfer learning tasks.
SBERT maintains BERT-level accuracy while enabling efficient semantic similarity search.
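The similarity search these findings refer to can be illustrated with a minimal sketch, assuming each sentence has already been mapped to a fixed-size embedding. The toy vectors and the helper names `cosine_similarity` and `most_similar_pair` are hypothetical and are not taken from the paper or from any library; real SBERT embeddings have several hundred dimensions.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar_pair(embeddings):
    """Return the index pair (i, j), i < j, with the highest cosine similarity."""
    return max(
        combinations(range(len(embeddings)), 2),
        key=lambda pair: cosine_similarity(embeddings[pair[0]], embeddings[pair[1]]),
    )


# Hypothetical 3-dimensional stand-ins for sentence embeddings.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

best = most_similar_pair(toy_embeddings)
print(best)  # (0, 1)
```

The point of the sketch is that the comparison step involves only vector arithmetic: the network is needed once per sentence to produce the embedding, not once per pair.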
Abstract
BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, they require that both sentences are fed into the network, which causes a massive computational overhead: Finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine-similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to…
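The figure of about 50 million inference computations follows from counting unordered sentence pairs, n(n-1)/2 for n = 10,000. A short check of that arithmetic, with variable names chosen here purely for illustration:

```python
from math import comb

n_sentences = 10_000

# Scoring sentence pairs jointly, as BERT does, needs one forward pass
# per unordered pair of sentences.
pairwise_passes = comb(n_sentences, 2)

# Embedding each sentence independently needs one forward pass per sentence;
# the pairs are then compared with cosine similarity instead of the network.
embedding_passes = n_sentences

print(pairwise_passes)   # 49995000
print(embedding_passes)  # 10000
```

The number of pairs to compare is the same in both cases; what changes is that only 10,000 of the operations are network forward passes.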
Code & Models
- 🤗 sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 · model · 26.4M dl · ♡ 1178
- 🤗 lightonai/Reason-ModernColBERT · model · 17k dl · ♡ 237
- 🤗 lightonai/LateOn-Code-edge · model · 3.3k dl · ♡ 26
- 🤗 VAGOsolutions/SauerkrautLM-Multi-ModernColBERT · model · 165 dl · ♡ 10
- 🤗 sentence-transformers/clip-ViT-B-32-multilingual-v1 · model · 109k dl · ♡ 189
- 🤗 dangvantuan/vietnamese-embedding · model · 208k dl · ♡ 50
- 🤗 sentence-transformers/static-retrieval-mrl-en-v1 · model · ♡ 56
- 🤗 sentence-transformers/static-similarity-mrl-multilingual-v1 · model · ♡ 76
- 🤗 joe32140/ModernBERT-base-msmarco · model · 2.4k dl · ♡ 11
- 🤗 huyydangg/DEk21_hcmute_embedding · model · 202k dl · ♡ 34
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
Methods: Sentence-BERT · Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · RoBERTa · Dense Connections · Adam
