TL;DR
This paper introduces a bilingual deep generative transformer model that improves semantic sentence embeddings by leveraging parallel data and source separation, outperforming existing methods on semantic similarity tasks.
Contribution
It presents a novel variational probabilistic framework with transformers for source separation in bilingual sentence embeddings, enabling effective monolingual inference.
Findings
Outperforms state-of-the-art on semantic similarity benchmarks
Achieves significant gains on difficult evaluation subsets
Demonstrates effective source separation in bilingual sentence encoding
Abstract
Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such embeddings: properties shared by both sentences in a translation pair are likely semantic, while divergent properties are likely stylistic or language-specific. We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors. Our proposed approach differs from past work on semantic sentence encoding in two ways. First, by using a variational probabilistic framework, we introduce priors that encourage source separation, and can use our model's posterior to predict sentence…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsTest
