CoT-BERT: Enhancing Unsupervised Sentence Representation through Chain-of-Thought
Bowen Zhang, Kehua Chang, Chunping Li

TL;DR
CoT-BERT introduces a novel unsupervised sentence representation method that leverages Chain-of-Thought reasoning, an advanced contrastive loss, and template denoising to improve embedding quality without external components.
Contribution
This paper proposes CoT-BERT, a new approach that enhances unsupervised sentence embeddings by integrating Chain-of-Thought reasoning with improved contrastive learning and denoising strategies, avoiding external modules.
Findings
Outperforms established baselines in unsupervised sentence representation tasks.
Uses only pre-trained models without external components.
Demonstrates significant improvements through rigorous experiments.
Abstract
Unsupervised sentence representation learning aims to transform input sentences into fixed-length vectors enriched with intricate semantic information while obviating the reliance on labeled data. Recent strides within this domain have been significantly propelled by breakthroughs in contrastive learning and prompt engineering. Despite these advancements, the field has reached a plateau, leading some researchers to incorporate external components to enhance the quality of sentence embeddings. Such integration, though beneficial, complicates solutions and inflates demands for computational resources. In response to these challenges, this paper presents CoT-BERT, an innovative method that harnesses the progressive thinking of Chain-of-Thought reasoning to tap into the latent potential of pre-trained models like BERT. Additionally, we develop an advanced contrastive learning loss function…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: InfoNCE · Attention Is All You Need · Residual Connection · Adam · Weight Decay · Dropout · Linear Layer · Layer Normalization · WordPiece · Multi-Head Attention
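
For readers unfamiliar with the InfoNCE objective listed under Methods, the sketch below shows the standard in-batch form that contrastive sentence-embedding methods commonly build on. It is not the extended loss CoT-BERT proposes, which the truncated abstract does not describe. The function names, the toy 2-dimensional vectors, and the temperature of 0.05 are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.05):
    """Standard in-batch InfoNCE loss (illustrative, not the paper's loss).

    anchors[i] and positives[i] are two embeddings of the same sentence;
    every positives[j] with j != i serves as a negative for anchors[i].
    The temperature value is an assumed default.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true positive.
        total += log_denom - logits[i]
    return total / n


# Toy embeddings (hypothetical): each anchor should match the positive at the same index.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(anchors, aligned)
loss_swapped = info_nce(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The loss is small when each anchor is closest to its own positive and large when the pairs are mismatched, which is the signal that pulls two views of the same sentence together and pushes different sentences apart.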
