Speeding up Context-based Sentence Representation Learning with Non-autoregressive Convolutional Decoding
Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa

TL;DR
This paper introduces a fast, efficient, non-autoregressive convolutional decoder for unsupervised sentence representation learning, demonstrating improved performance and transferability across NLP tasks.
Contribution
It proposes a novel asymmetric encoder-decoder model with a non-autoregressive convolutional decoder, eliminating the need for autoregressive or RNN decoders in sentence representation learning.
Findings
The model is faster and simpler than traditional autoregressive models.
It achieves better performance on downstream NLP tasks.
The approach is effective across different large unlabelled corpora.
Abstract
Context plays an important role in human language understanding, thus it may also be useful for machines learning vector representations of language. In this paper, we explore an asymmetric encoder-decoder structure for unsupervised context-based sentence representation learning. We carefully designed experiments to show that neither an autoregressive decoder nor an RNN decoder is required. After that, we designed a model which still keeps an RNN as the encoder, while using a non-autoregressive convolutional decoder. We further combine a suite of effective designs to significantly improve model efficiency while also achieving better performance. Our model is trained on two different large unlabelled corpora, and in both cases the transferability is evaluated on a set of downstream NLP tasks. We empirically show that our model is simple and fast while producing rich sentence…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
