Embedding Word Similarity with Neural Machine Translation
Felix Hill, Kyunghyun Cho, Sebastien Jean, Coline Devin, Yoshua Bengio

TL;DR
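Neural machine translation models learn word embeddings that outperform those of monolingual models on tasks requiring knowledge of conceptual similarity and lexical-syntactic role. The effect holds for both English-to-French and English-to-German translation, and a training method for very large vocabularies causes only minimal degradation of embedding quality.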
Contribution
The study shows that translation-based embeddings outperform monolingual ones on tasks that depend on conceptual similarity and lexical-syntactic role, and it applies a new method for training neural translation models with very large vocabularies.
Findings
Translation embeddings outperform monolingual embeddings in similarity tasks.
The advantage holds for both English-to-French and English-to-German translation models.
The vocabulary expansion algorithm causes minimal degradation of embedding quality.
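As a rough illustration of what "outperforming on similarity tasks" can mean in practice, the sketch below scores a set of embeddings by comparing the cosine similarity of word pairs against human similarity ratings using Spearman rank correlation. This is a common evaluation pattern and is assumed here, not taken from the paper's exact protocol. The function names (`cosine`, `ranks`, `spearman`, `evaluate_similarity`) and the toy vectors and ratings are hypothetical and made up for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def evaluate_similarity(embeddings, rated_pairs):
    """Correlate model cosine similarities with human ratings.

    Pairs containing a word missing from `embeddings` are skipped.
    """
    model_scores, human_scores = [], []
    for w1, w2, rating in rated_pairs:
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Toy, made-up vectors and ratings (not data from the paper).
toy_embeddings = {
    "coast": [1.0, 0.0],
    "shore": [0.9, 0.1],
    "car": [0.0, 1.0],
    "road": [0.6, 0.8],
}
toy_pairs = [
    ("coast", "shore", 9.0),
    ("car", "road", 5.0),
    ("coast", "car", 1.0),
]

score = evaluate_similarity(toy_embeddings, toy_pairs)
print(f"Spearman correlation on toy data: {score:.3f}")
```

Under this setup, two embedding sets (for example, one from a translation model and one from a monolingual model) could be compared by running `evaluate_similarity` on each with the same rated pairs; the set with the higher correlation agrees more closely with the human judgments.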
Abstract
Neural language models learn word representations, or embeddings, that capture rich linguistic and conceptual information. Here we investigate the embeddings learned by neural machine translation models, a recently-developed class of neural language model. We show that embeddings from translation models outperform those learned by monolingual models at tasks that require knowledge of both conceptual similarity and lexical-syntactic role. We further show that these effects hold when translating from both English to French and English to German, and argue that the desirable properties of translation embeddings should emerge largely independently of the source and target languages. Finally, we apply a new method for training neural translation models with very large vocabularies, and show that this vocabulary expansion algorithm results in minimal degradation of embedding quality. Our…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
