Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
Sandeep Subramanian, Adam Trischler, Yoshua Bengio, Christopher J Pal

TL;DR
This paper introduces a multi-task learning framework for creating general-purpose sentence representations by training on diverse tasks and data sources, resulting in improved transfer learning performance.
Contribution
It proposes a simple, effective multi-task learning approach that combines multiple training objectives to learn high-quality sentence embeddings from large-scale data.
Findings
Consistent improvements over previous methods in transfer learning tasks.
Enhanced performance in low-resource NLP settings.
Effective sharing of a single encoder across diverse tasks.
Abstract
A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general purpose features for words across a range of NLP problems. However, extending this success to learning representations of sequences of words, such as sentences, remains an open problem. Recent work has explored unsupervised as well as supervised learning techniques with different training objectives to learn general purpose fixed-length sentence representations. In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model. We train this model on several data sources with multiple training objectives on over 100 million…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
