DebCSE: Rethinking Unsupervised Contrastive Sentence Embedding Learning in the Debiasing Perspective
Pu Miao, Zeyao Du, Junlin Zhang

TL;DR
DebCSE introduces a debiasing contrastive learning framework for sentence embeddings that reduces biases such as sentence length bias and false negative bias, leading to improved semantic similarity performance.
Contribution
The paper proposes DebCSE, a novel contrastive learning method that employs inverse propensity weighted sampling to eliminate multiple biases in unsupervised sentence embedding.
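To make the idea of inverse propensity weighted sampling concrete, here is a minimal sketch of the general technique. It is not the authors' implementation: this summary does not say how DebCSE defines or estimates propensities, so `propensities` is a hypothetical input (the assumed probability that a naive sampling rule would pick each candidate), and the function names are illustrative only.

```python
import random


def ipw_probabilities(propensities):
    """Turn propensity scores into sampling probabilities proportional to 1 / propensity."""
    if any(p <= 0 for p in propensities):
        raise ValueError("propensities must be strictly positive")
    weights = [1.0 / p for p in propensities]
    total = sum(weights)
    return [w / total for w in weights]


def ipw_sample(candidates, propensities, k, seed=None):
    """Draw k candidates (with replacement) using inverse propensity weights.

    Candidates that a naive rule would over-select (high propensity) are
    down-weighted, and rarely selected ones are up-weighted.
    """
    if len(candidates) != len(propensities):
        raise ValueError("candidates and propensities must have the same length")
    probs = ipw_probabilities(propensities)
    rng = random.Random(seed)
    return rng.choices(candidates, weights=probs, k=k)


# Hypothetical example: the first candidate is twice as likely to be picked
# by the naive rule, so it receives half the sampling weight of the others.
probs = ipw_probabilities([0.5, 0.25, 0.25])
picked = ipw_sample(["s1", "s2", "s3"], [0.5, 0.25, 0.25], k=5, seed=0)
```

In this sketch the reweighting only changes how often each candidate is drawn; how the drawn samples are then used as positives or negatives in the contrastive objective is outside what the summary describes.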
Findings
DebCSE outperforms state-of-the-art models on STS benchmarks.
With a BERTbase backbone, it achieves an average Spearman's correlation of 80.33% on the STS benchmarks.
The method effectively reduces biases in contrastive learning for sentence embeddings.
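For readers unfamiliar with the metric, the sketch below shows one common way a Spearman's correlation score is computed for semantic textual similarity: rank the model's cosine similarities and the human similarity ratings, then take the Pearson correlation of the two rank lists. The helper names are illustrative and the exact evaluation script used in the paper is not given in this summary. A reported figure such as 80.33% would correspond to a correlation of about 0.8033 expressed as a percentage.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks matter, the score is unchanged by any monotonic rescaling of the model's similarity values.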
Abstract
Several prior studies have suggested that word frequency biases can cause the BERT model to learn indistinguishable sentence embeddings. Contrastive learning schemes such as SimCSE and ConSERT have already been adopted successfully in unsupervised sentence embedding to improve the quality of embeddings by reducing this bias. However, these methods still introduce new biases, such as sentence length bias and false negative sample bias, which hinder the model's ability to learn more fine-grained semantics. In this paper, we reexamine the challenges of contrastive sentence embedding learning from a debiasing perspective and argue that effectively eliminating the influence of various biases is crucial for learning high-quality sentence embeddings. We think all those biases are introduced by simple rules for constructing training data in contrastive learning and the key for contrastive learning…
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Residual Connection · Adam · Weight Decay · Softmax · Linear Warmup With Linear Decay
