Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion
Armand Joulin, Piotr Bojanowski, Tomas Mikolov, Herve Jegou, Edouard Grave

TL;DR
This paper introduces an end-to-end method for aligning bilingual word embeddings by directly optimizing a retrieval criterion, which improves word translation accuracy, especially for distant language pairs such as English-Chinese.
Contribution
It presents a unified formulation that directly optimizes a retrieval criterion for bilingual word mapping, in place of the usual two-step approach that learns the mapping by least-squares regression and applies a retrieval criterion only at inference.
Findings
Outperforms the state of the art on word translation across standard benchmarks
Significant improvements for distant language pairs like English-Chinese
Effective end-to-end training for bilingual embedding alignment
Abstract
Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space. Existing works typically solve a least-squares regression problem to learn a rotation aligning a small bilingual lexicon, and use a retrieval criterion for inference. In this paper, we propose a unified formulation that directly optimizes a retrieval criterion in an end-to-end fashion. Our experiments on standard benchmarks show that our approach outperforms the state of the art on word translation, with the biggest improvements observed for distant language pairs such as English-Chinese.
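To make the two-step baseline mentioned in the abstract concrete, here is a minimal sketch of it: a rotation fitted by least-squares regression (orthogonal Procrustes, solved in closed form with an SVD), followed by nearest-neighbor retrieval at inference. It illustrates the prior approach the paper contrasts itself with, not the paper's proposed criterion. The function names `procrustes_rotation` and `retrieve`, the cosine-similarity retrieval rule, and the synthetic toy data are all assumptions made for illustration.

```python
import numpy as np


def procrustes_rotation(X, Y):
    """Return the orthogonal W minimizing ||X @ W - Y||_F.

    X and Y are (n, d) arrays whose rows are the embeddings of the
    n word pairs in a bilingual lexicon (hypothetical helper).
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def retrieve(X, W, Z):
    """Map source vectors with W, then return, for each one, the index
    of the target row of Z with the highest cosine similarity."""
    mapped = X @ W
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    targets = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    return np.argmax(mapped @ targets.T, axis=1)


# Synthetic toy data (an assumption): the "target language" is an exact
# rotation of the "source language", so the alignment is recoverable.
rng = np.random.default_rng(0)
n, d = 50, 8
X = rng.standard_normal((n, d))
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # hidden orthogonal map
Y = X @ Q

W = procrustes_rotation(X, Y)        # step 1: regression
predictions = retrieve(X, W, Y)      # step 2: retrieval at inference
accuracy = float(np.mean(predictions == np.arange(n)))
```

In this sketch the regression objective used to fit `W` and the retrieval rule used to evaluate it are different criteria, which is the mismatch the abstract describes; the paper's contribution is to optimize the retrieval criterion directly.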
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
