Learning Probabilistic Sentence Representations from Paraphrases

Mingda Chen; Kevin Gimpel

arXiv:2005.08105·cs.CL·May 19, 2020

TL;DR

This paper introduces probabilistic models that represent sentences as distributions and are trained on paraphrases. The models capture sentence specificity and entailment; the best-performing one treats each word as a linear transformation applied to a multivariate Gaussian distribution.

Contribution

It proposes probabilistic sentence embedding models trained on paraphrases, an area the authors note has seen very little prior work, and shows that they capture notions of specificity and entailment.

Findings

01

The best-performing model treats each word as a linear transformation applied to a multivariate Gaussian distribution.

02

The probabilistic models capture sentential entailment and sentence specificity.

03

Simpler architectures also represent specificity, through the norm of the sentence vectors.

Abstract

Probabilistic word embeddings have shown effectiveness in capturing notions of generality and entailment, but there is very little work on doing the analogous type of investigation for sentences. In this paper we define probabilistic models that produce distributions for sentences. Our best-performing model treats each word as a linear transformation operator applied to a multivariate Gaussian distribution. We train our models on paraphrases and demonstrate that they naturally capture sentence specificity. While our proposed model achieves the best performance overall, we also show that specificity is represented by simpler architectures via the norm of the sentence vectors. Qualitative analysis shows that our probabilistic model captures sentential entailment and provides ways to analyze the specificity and preciseness of individual words.
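To make the idea of "each word as a linear transformation operator applied to a Gaussian" concrete, here is a minimal sketch rather than the authors' implementation. It assumes a diagonal covariance, a standard normal starting distribution, elementwise (diagonal) scale-and-shift operators, and differential entropy as a stand-in measure of specificity; none of these choices are confirmed by the abstract. The names `apply_diagonal_affine`, `encode_sentence`, `gaussian_entropy`, and `TOY_OPERATORS`, along with all numeric values, are hypothetical. The only fact it relies on is that an affine map of a Gaussian is again a Gaussian: if x ~ N(mu, diag(v)), then a*x + b ~ N(a*mu + b, diag(a^2 * v)).

```python
import math


def apply_diagonal_affine(mean, var, scale, shift):
    """Push N(mean, diag(var)) through x -> scale * x + shift (elementwise).

    The result is N(scale * mean + shift, diag(scale**2 * var)).
    """
    new_mean = [a * m + b for a, m, b in zip(scale, mean, shift)]
    new_var = [a * a * v for a, v in zip(scale, var)]
    return new_mean, new_var


def encode_sentence(words, operators, dim):
    """Apply each word's operator in order, starting from N(0, I) (an assumption)."""
    mean, var = [0.0] * dim, [1.0] * dim
    for word in words:
        scale, shift = operators[word]
        mean, var = apply_diagonal_affine(mean, var, scale, shift)
    return mean, var


def gaussian_entropy(var):
    """Differential entropy of a diagonal Gaussian: 0.5 * sum(log(2*pi*e*v_i))."""
    dim = len(var)
    log_det = sum(math.log(v) for v in var)
    return 0.5 * (dim * (1.0 + math.log(2.0 * math.pi)) + log_det)


# Hypothetical hand-set operators: (scale, shift) per word. In a trained
# model these would be learned; here "dog" shrinks the variance more than
# "animal", so it yields a narrower (lower-entropy) distribution.
TOY_OPERATORS = {
    "a": ([1.0, 1.0], [0.0, 0.0]),
    "animal": ([0.9, 0.9], [0.1, 0.0]),
    "dog": ([0.5, 0.5], [0.3, -0.2]),
}

general_mean, general_var = encode_sentence(["a", "animal"], TOY_OPERATORS, dim=2)
specific_mean, specific_var = encode_sentence(["a", "dog"], TOY_OPERATORS, dim=2)

print("entropy of 'a animal':", gaussian_entropy(general_var))
print("entropy of 'a dog':   ", gaussian_entropy(specific_var))
```

In this toy setup the sentence containing "dog" ends up with a lower entropy than the one containing "animal", which is one way a distribution-valued representation could encode that a sentence is more specific.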

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques