Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval
John Wieting, Jonathan H. Clark, William W. Cohen, Graham Neubig, and Taylor Berg-Kirkpatrick

TL;DR
This paper introduces a variational generative model for multilingual text embeddings that effectively separates shared semantic content from language-specific variations, outperforming contrastive methods in various retrieval tasks.
Contribution
It proposes a variational generative approach to multilingual embeddings that encourages source separation and outperforms contrastive methods on retrieval tasks.
Findings
Outperforms contrastive and generative baselines on multiple tasks.
Introduces a new cross-lingual question retrieval task.
Shows efficient source separation in multilingual embeddings.
Abstract
Contrastive learning has been successfully used for retrieval of semantically aligned sentences, but it often requires large batch sizes or careful engineering to work well. In this paper, we instead propose a generative model for learning multilingual text embeddings which can be used to retrieve or score sentence pairs. Our model operates on parallel data in N languages and, through an approximation we introduce, efficiently encourages source separation in this multilingual setting, separating semantic information that is shared between translations from stylistic or language-specific variation. We show careful large-scale comparisons between contrastive and generation-based approaches for learning multilingual text embeddings, a comparison that has not been done to the best of our knowledge despite the popularity of these approaches. We evaluate this method on a suite of tasks…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Dense Connections · Residual Connection · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Absolute Position Encodings
