Not All Neural Embeddings are Born Equal
Felix Hill, KyungHyun Cho, Sebastien Jean, Coline Devin, Yoshua Bengio

TL;DR
This paper compares word embeddings learned by neural machine translation models with those learned by monolingual models. Translation-based embeddings better capture conceptual similarity and syntactic role, which suggests the two model types encode different kinds of linguistic knowledge.
Contribution
The paper demonstrates that neural translation models produce embeddings that reflect conceptual similarity and ontological status more accurately than those of monolingual models.
Findings
Translation-based embeddings outperform monolingual ones on single-language tasks that require knowledge of conceptual similarity and/or syntactic role.
Neural translation models better capture the ontological status of concepts.
Monolingual models learn about concept relations, but translation models encode deeper conceptual information.
Abstract
Neural language models learn word representations that capture rich linguistic and conceptual information. Here we investigate the embeddings learned by neural machine translation models. We show that translation-based embeddings outperform those learned by cutting-edge monolingual models at single-language tasks requiring knowledge of conceptual similarity and/or syntactic role. The findings suggest that, while monolingual models learn information about how concepts are related, neural-translation models better capture their true ontological status.
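To make "tasks requiring knowledge of conceptual similarity" concrete, the sketch below shows one common way such a task can be scored: compute the cosine similarity between the embeddings of each word pair, then measure how well those scores rank-correlate (Spearman) with human similarity ratings. This is an illustrative assumption about the evaluation setup, not the authors' code. The function names, the toy vectors, and the ratings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (std_x * std_y)


def evaluate(embeddings, gold_pairs):
    """Correlate model similarities with human ratings over (word, word, rating) triples."""
    model_scores = [cosine(embeddings[w1], embeddings[w2]) for w1, w2, _ in gold_pairs]
    human_scores = [rating for _, _, rating in gold_pairs]
    return spearman(model_scores, human_scores)


# Hypothetical toy data: 3-dimensional vectors and made-up ratings.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "bus": [0.2, 0.8, 0.3],
}
toy_gold = [("cat", "dog", 8.0), ("car", "bus", 7.5), ("cat", "car", 2.0)]

if __name__ == "__main__":
    print(evaluate(toy_embeddings, toy_gold))
```

Under this setup, comparing two sets of embeddings amounts to calling `evaluate` once per model on the same rated pairs and comparing the resulting correlations.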
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
