A Bilingual Generative Transformer for Semantic Sentence Embedding

John Wieting; Graham Neubig; Taylor Berg-Kirkpatrick

arXiv:1911.03895 · cs.CL · November 20, 2020

TL;DR

This paper introduces a bilingual deep generative transformer model that improves semantic sentence embeddings by leveraging parallel data and source separation, outperforming existing methods on semantic similarity tasks.

Contribution

The paper presents a variational probabilistic framework, built on transformers, that performs source separation on bilingual sentence pairs while still allowing sentence embeddings to be inferred from monolingual input.

Findings

01. Outperforms the state of the art on semantic similarity benchmarks

02. Achieves significant gains on difficult evaluation subsets

03. Demonstrates effective source separation in bilingual sentence encoding

Abstract

Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such embeddings: properties shared by both sentences in a translation pair are likely semantic, while divergent properties are likely stylistic or language-specific. We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors. Our proposed approach differs from past work on semantic sentence encoding in two ways. First, by using a variational probabilistic framework, we introduce priors that encourage source separation, and can use our model's posterior to predict sentence…
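As a reading aid, the source-separation idea described in the abstract can be sketched as a latent variable factorization. This is only one formulation consistent with that description, not necessarily the exact objective used in the paper. All symbols are assumptions introduced here: $x_1, x_2$ are the two sentences of a translation pair, $z_{\text{sem}}$ is the shared semantic latent vector, $z_1, z_2$ are the language-specific latent vectors, and $q$ is a factorized approximate posterior.

```latex
% Assumed generative factorization: one shared semantic variable,
% one language-specific variable per sentence in the pair.
p(x_1, x_2, z_{\text{sem}}, z_1, z_2)
  = p(z_{\text{sem}})\, p(z_1)\, p(z_2)\,
    p(x_1 \mid z_{\text{sem}}, z_1)\,
    p(x_2 \mid z_{\text{sem}}, z_2)

% Assumed factorized approximate posterior.
q(z_{\text{sem}}, z_1, z_2 \mid x_1, x_2)
  = q(z_{\text{sem}} \mid x_1, x_2)\, q(z_1 \mid x_1)\, q(z_2 \mid x_2)

% Evidence lower bound that follows from the two assumptions above.
\log p(x_1, x_2) \;\ge\;
  \mathbb{E}_{q}\!\left[
    \log p(x_1 \mid z_{\text{sem}}, z_1)
    + \log p(x_2 \mid z_{\text{sem}}, z_2)
  \right]
  - \mathrm{KL}\!\left( q(z_{\text{sem}} \mid x_1, x_2) \,\|\, p(z_{\text{sem}}) \right)
  - \mathrm{KL}\!\left( q(z_1 \mid x_1) \,\|\, p(z_1) \right)
  - \mathrm{KL}\!\left( q(z_2 \mid x_2) \,\|\, p(z_2) \right)
```

Under this sketch, the KL terms are where the priors act: each latent vector is penalized for carrying information, so content common to both sentences is cheapest to store once in $z_{\text{sem}}$, while the remainder falls to $z_1$ and $z_2$.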
