Towards Universal Paraphrastic Sentence Embeddings

John Wieting; Mohit Bansal; Kevin Gimpel; Karen Livescu

arXiv:1511.08198 · cs.CL · March 7, 2016 · ICLR · 117 cites

TL;DR

This paper evaluates various architectures for learning universal paraphrastic sentence embeddings, finding simple averaging models effective across domains and demonstrating their utility in multiple NLP tasks, with resources released for community use.

Contribution

The paper compares six compositional architectures for sentence embeddings, highlighting the efficiency and effectiveness of simple averaging models across diverse NLP tasks and domains.

Findings

01. Simple averaging models outperform LSTMs out-of-domain.

02. LSTMs achieve state-of-the-art on sentiment classification.

03. Pretrained embeddings improve performance on similarity and entailment.

Abstract

We consider the problem of learning general-purpose, paraphrastic sentence embeddings based on supervision from the Paraphrase Database (Ganitkevitch et al., 2013). We compare six compositional architectures, evaluating them on annotated textual similarity datasets drawn both from the same distribution as the training data and from a wide range of other domains. We find that the most complex architectures, such as long short-term memory (LSTM) recurrent neural networks, perform best on the in-domain data. However, in out-of-domain scenarios, simple architectures such as word averaging vastly outperform LSTMs. Our simplest averaging model is even competitive with systems tuned for the particular tasks while also being extremely efficient and easy to use. In order to better understand how these architectures compare, we conduct further experiments on three supervised NLP tasks: sentence…
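To make the word-averaging idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a sentence embedding is the unweighted mean of its word vectors and that two sentences are compared by cosine similarity. The vectors in `TOY_VECTORS` are hand-picked, hypothetical values for illustration only; they are not embeddings trained on the Paraphrase Database, and the names `embed` and `cosine` are ours.

```python
import math

# Hypothetical 3-dimensional word vectors, chosen by hand for illustration.
# A real model would learn these from paraphrase supervision.
TOY_VECTORS = {
    "a": [0.1, 0.0, 0.1],
    "man": [0.9, 0.1, 0.0],
    "guy": [0.8, 0.2, 0.1],
    "plays": [0.0, 0.9, 0.1],
    "performs": [0.1, 0.8, 0.2],
    "guitar": [0.1, 0.2, 0.9],
    "music": [0.2, 0.3, 0.8],
}


def embed(sentence, vectors=TOY_VECTORS):
    """Return the average of the word vectors of all known tokens."""
    tokens = [t for t in sentence.lower().split() if t in vectors]
    if not tokens:
        raise ValueError("sentence contains no known tokens")
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for token in tokens:
        for i, value in enumerate(vectors[token]):
            total[i] += value
    return [x / len(tokens) for x in total]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    s1 = embed("a man plays guitar")
    s2 = embed("a guy performs music")
    print(round(cosine(s1, s2), 3))
```

Because the composition is just a mean, the model has no parameters beyond the word vectors themselves and ignores word order, which is part of what makes this kind of model efficient and easy to use.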

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory