TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning
Kexin Wang, Nils Reimers, Iryna Gurevych

TL;DR
TSDAE is a Transformer-based unsupervised method for learning sentence embeddings. It outperforms previous unsupervised approaches and generalizes well across multiple domains, reducing the need for labeled data.
Contribution
The paper presents TSDAE, a Transformer-based sequential denoising auto-encoder for unsupervised sentence embedding learning. It sets a new state of the art among unsupervised methods and is a strong domain adaptation and pre-training method for sentence embeddings.
Findings
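- Outperforms previous unsupervised approaches by up to 6.4 points
- Achieves up to 93.1% of the performance of in-domain supervised approaches
- Works as a domain adaptation and pre-training method, significantly outperforming alternatives such as Masked Language Model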
Abstract
Learning sentence embeddings often requires a large amount of labeled data. However, for most tasks and domains, labeled data is seldom available and creating it is expensive. In this work, we present a new state-of-the-art unsupervised method based on pre-trained Transformers and Sequential Denoising Auto-Encoder (TSDAE) which outperforms previous approaches by up to 6.4 points. It can achieve up to 93.1% of the performance of in-domain supervised approaches. Further, we show that TSDAE is a strong domain adaptation and pre-training method for sentence embeddings, significantly outperforming other approaches like Masked Language Model. A crucial shortcoming of previous studies is the narrow evaluation: Most work mainly evaluates on the single task of Semantic Textual Similarity (STS), which does not require any domain knowledge. It is unclear if these proposed methods generalize to…
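The abstract names a sequential denoising auto-encoder but does not describe the corruption step in this excerpt. As a minimal sketch, assuming the noise is random token deletion, building (noisy, clean) training pairs from unlabeled sentences could look like the following. The function names, the whitespace tokenization, and the 0.6 deletion ratio are illustrative assumptions, not details taken from the text above.

```python
import random


def delete_tokens(sentence, deletion_ratio=0.6, rng=None):
    """Return a corrupted copy of `sentence` with random tokens removed.

    Assumption: whitespace tokenization and independent per-token deletion.
    """
    rng = rng or random.Random()
    tokens = sentence.split()
    if not tokens:
        return sentence
    # Keep each token with probability (1 - deletion_ratio).
    kept = [tok for tok in tokens if rng.random() >= deletion_ratio]
    if not kept:
        # Never return an empty input: fall back to one random token.
        kept = [rng.choice(tokens)]
    return " ".join(kept)


def make_denoising_pairs(sentences, deletion_ratio=0.6, seed=0):
    """Build (noisy_input, clean_target) pairs from unlabeled sentences."""
    rng = random.Random(seed)
    return [(delete_tokens(s, deletion_ratio, rng), s) for s in sentences]


if __name__ == "__main__":
    unlabeled = [
        "how do I install python on ubuntu",
        "my wifi adapter is not detected after the upgrade",
    ]
    for noisy, clean in make_denoising_pairs(unlabeled):
        print(f"{noisy!r} -> {clean!r}")
```

In a full setup, an encoder would compress each noisy sentence into a single fixed-size vector and a decoder would be trained to reconstruct the clean sentence from that vector. The model itself is omitted here; only the unlabeled-data preparation is shown.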
Code & Models
- 🤗 kwang2049/TSDAE-askubuntu · model · 1 dl
- 🤗 kwang2049/TSDAE-askubuntu2nli_stsb · model · 1 dl
- 🤗 kwang2049/TSDAE-cqadupstack · model · 1 dl
- 🤗 kwang2049/TSDAE-cqadupstack2nli_stsb · model
- 🤗 kwang2049/TSDAE-scidocs · model · 7 dl
- 🤗 kwang2049/TSDAE-scidocs2nli_stsb · model · 1 dl
- 🤗 kwang2049/TSDAE-twitterpara · model · 2 dl · ♡ 1
- 🤗 kwang2049/TSDAE-twitterpara2nli_stsb · model · 1 dl
- 🤗 smartmind/roberta-ko-small-tsdae · model · 7 dl · ♡ 2
- 🤗 lengocduc195/SentenceTransformer · model
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
Methods: Attention Is All You Need · Softmax · TSDAE
