Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion

Armand Joulin; Piotr Bojanowski; Tomas Mikolov; Herve Jegou; Edouard Grave

arXiv:1804.07745·cs.CL·September 6, 2018·19 cites

TL;DR

This paper introduces an end-to-end method for aligning bilingual word embeddings by directly optimizing a retrieval criterion, which improves word translation accuracy, especially for distant language pairs.

Contribution

It presents a unified formulation that directly optimizes retrieval for bilingual word mapping, surpassing previous methods that relied on separate regression and retrieval steps.

Findings

01 Outperforms the state of the art on standard word translation benchmarks

02 Largest improvements on distant language pairs such as English-Chinese

03 Effective end-to-end training for bilingual embedding alignment

Abstract

Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space. Existing works typically solve a least-squares regression problem to learn a rotation aligning a small bilingual lexicon, and use a retrieval criterion for inference. In this paper, we propose a unified formulation that directly optimizes a retrieval criterion in an end-to-end fashion. Our experiments on standard benchmarks show that our approach outperforms the state of the art on word translation, with the biggest improvements observed for distant language pairs such as English-Chinese.
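
As a rough illustration of the two-step baseline the abstract attributes to existing works (fit a rotation on a small lexicon by least squares, then translate by retrieval), the NumPy sketch below may help. The function names, the row-wise array layout, and the choice of plain cosine nearest-neighbor retrieval are assumptions made for this example and are not taken from the paper. It does not implement the paper's end-to-end retrieval objective.

```python
import numpy as np


def procrustes_rotation(src, tgt):
    """Orthogonal W minimizing sum_i ||W @ src[i] - tgt[i]||^2.

    src, tgt: (n, d) arrays whose rows are the embeddings of the n
    lexicon pairs (a hypothetical layout chosen for this sketch).
    """
    # Closed-form orthogonal Procrustes solution: SVD of tgt^T src.
    u, _, vt = np.linalg.svd(tgt.T @ src)
    return u @ vt


def retrieve(queries, w, candidates):
    """Map source queries with w, then return, for each query, the index
    of the nearest candidate target vector under cosine similarity."""
    mapped = queries @ w.T
    # Normalize rows so the dot product equals cosine similarity.
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    cands = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return np.argmax(mapped @ cands.T, axis=1)
```

In this sketch the mapping is fit with one criterion (squared error) and used with another (nearest-neighbor retrieval). That mismatch between training and inference is what the unified formulation described in the abstract is meant to remove.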

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications