miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings
Tassilo Klein, Moin Nabi

TL;DR
miCSE introduces a mutual information contrastive learning framework that enhances few-shot sentence embedding by enforcing structural consistency across augmented views, leading to improved sample efficiency and state-of-the-art results in few-shot scenarios.
Contribution
The paper proposes miCSE, a mutual information-based contrastive learning method that improves few-shot sentence embedding by aligning attention patterns across augmented views of each sentence, which enforces structural consistency between the views.
Findings
Achieves state-of-the-art performance in few-shot sentence embedding tasks.
Shows strong sample efficiency compared to existing methods.
Performs comparably to state-of-the-art methods in the full-shot (full-data) scenario.
Abstract
This paper presents miCSE, a mutual information-based contrastive learning framework that significantly advances the state-of-the-art in few-shot sentence embedding. The proposed approach imposes alignment between the attention pattern of different views during contrastive learning. Learning sentence embeddings with miCSE entails enforcing the structural consistency across augmented views for every sentence, making contrastive self-supervised learning more sample efficient. As a result, the proposed approach shows strong performance in the few-shot learning domain. While it achieves superior results compared to state-of-the-art methods on multiple benchmarks in few-shot learning, it is comparable in the full-shot scenario. This study opens up avenues for efficient self-supervised learning methods that are more robust than current contrastive methods for sentence embedding.
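The idea in the abstract, a contrastive objective combined with an alignment term on the attention patterns of two augmented views, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the temperature, the weight `lam`, and the use of a Gaussian mutual-information proxy (I = -0.5 log(1 - rho^2), computed from the correlation rho between two attention maps) are all assumptions chosen for illustration; the abstract does not specify the exact regularizer.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.05):
    """InfoNCE loss: row i of z1 is the positive for row i of z2."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # (batch, batch) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def gaussian_mi(a1, a2, eps=1e-6):
    """Illustrative MI proxy between two attention maps (ASSUMPTION: joint Gaussianity)."""
    rho = np.corrcoef(a1.ravel(), a2.ravel())[0, 1]
    rho2 = min(rho ** 2, 1.0 - eps)                       # clip so the log stays finite
    return -0.5 * np.log(1.0 - rho2)

def combined_loss(z1, z2, attn1, attn2, lam=0.1):
    """Contrastive loss minus a weighted attention-alignment reward.

    z1, z2:       (batch, dim) sentence embeddings of the two views
    attn1, attn2: (batch, seq, seq) attention maps of the two views
    lam:          hypothetical weight of the alignment term
    """
    mi = np.mean([gaussian_mi(a, b) for a, b in zip(attn1, attn2)])
    return info_nce(z1, z2) - lam * mi

# Toy usage with random data standing in for encoder outputs.
rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))
attn = rng.random((4, 5, 5))
loss_aligned = combined_loss(z, z, attn, attn)
loss_misaligned = combined_loss(z, z, attn, rng.random((4, 5, 5)))
```

In this toy setup, views with matching attention maps receive a lower loss than views whose attention maps are unrelated, which is the behaviour an attention-alignment term is meant to encourage.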
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
Methods: Contrastive Learning
