Unsupervised Hyperalignment for Multilingual Word Embeddings
Jean Alaux, Edouard Grave, Marco Cuturi, Armand Joulin

TL;DR
This paper introduces an unsupervised method for aligning word embeddings from multiple languages in a common space, improving indirect translation quality through a formulation that makes the mappings composable.
Contribution
It extends unsupervised alignment of word embeddings from two languages to multiple languages, with a new formulation that ensures the mappings are composable.
Findings
Improved indirect translation accuracy across eleven languages.
Maintained competitive performance on direct word translation.
Demonstrated the effectiveness of the composable mapping approach.
Abstract
We consider the problem of aligning continuous word representations, learned in multiple languages, to a common space. It was recently shown that, in the case of two languages, it is possible to learn such a mapping without supervision. This paper extends this line of work to the problem of aligning multiple languages to a common space. A solution is to independently map all languages to a pivot language. Unfortunately, this degrades the quality of indirect word translation. We thus propose a novel formulation that ensures composable mappings, leading to better alignments. We evaluate our method by jointly aligning word vectors in eleven languages, showing consistent improvement with indirect mappings while maintaining competitive performance on direct word translation.
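The idea of composing mappings through a common space can be sketched in a few lines. The example below is only an illustration under strong assumptions, not the paper's method: it uses synthetic, noise-free embeddings with a known word correspondence, and it substitutes supervised orthogonal Procrustes for the unsupervised objective. The helper names (random_orthogonal, procrustes) and the language labels are hypothetical. It shows that when each language has an orthogonal map into the common space, the indirect map between two languages is the product of one map with the transpose of the other.

```python
import numpy as np

def random_orthogonal(dim, rng):
    # Orthogonal matrix from the QR decomposition of a Gaussian matrix.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def procrustes(X, Y):
    # Orthogonal Q minimizing ||X Q - Y||_F (rows of X and Y are paired).
    # This supervised solver is a stand-in for the unsupervised alignment.
    u, _, vt = np.linalg.svd(X.T @ Y)
    return u @ vt

rng = np.random.default_rng(0)
dim, n_words = 5, 50

# Assumption: the pivot language and two others share the same geometry,
# differing only by a rotation (real embeddings are only approximately so).
pivot = rng.standard_normal((n_words, dim))
langs = {name: pivot @ random_orthogonal(dim, rng).T for name in ("fr", "de")}

# Map each language independently into the pivot (common) space.
Q = {name: procrustes(X, pivot) for name, X in langs.items()}

# Indirect mapping fr -> de: go into the common space, then back out.
fr_to_de = Q["fr"] @ Q["de"].T
error = np.linalg.norm(langs["fr"] @ fr_to_de - langs["de"])
print(f"indirect fr->de alignment error: {error:.2e}")
```

In this idealized setting the composed map is essentially exact. With real embeddings, each pivot mapping is fitted independently and the composition is never optimized directly, which is the weakness of pivot-based alignment that the abstract points to.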
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
