Bilingual Embeddings with Random Walks over Multilingual Wordnets
J. Goikoetxea, A. Soroa, E. Agirre

TL;DR
This paper introduces a novel method for learning bilingual word embeddings using random walks over multilingual WordNets, which outperforms dictionary-based approaches and can be extended to other knowledge bases.
Contribution
The authors propose a new approach that leverages multilingual WordNets and random walks to learn bilingual embeddings in a single step, integrating cross-lingual constraints into the skipgram model.
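As a rough illustration of how random walks over a multilingual wordnet can yield mixed-language training contexts, here is a minimal sketch. The toy graph, the lexicon, the `en:`/`es:` token prefixes, and the parameters `stop_prob` and `max_len` are all assumptions made for this example, not the authors' data or settings. Each walk moves between concept nodes and, at every step, emits a word for the current concept in a randomly chosen language; the resulting pseudo-sentences could then be fed to a skipgram trainer.

```python
import random

# Hypothetical toy graph: concept (synset) nodes and their neighbours.
GRAPH = {
    "c_dog": ["c_animal", "c_bark"],
    "c_cat": ["c_animal"],
    "c_animal": ["c_dog", "c_cat"],
    "c_bark": ["c_dog"],
}

# Hypothetical lexicalizations of each concept in two languages.
LEXICON = {
    "c_dog": {"en": ["dog"], "es": ["perro"]},
    "c_cat": {"en": ["cat"], "es": ["gato"]},
    "c_animal": {"en": ["animal"], "es": ["animal"]},
    "c_bark": {"en": ["bark"], "es": ["ladrido"]},
}


def random_walk(graph, lexicon, start, rng, langs=("en", "es"),
                stop_prob=0.15, max_len=10):
    """Walk the concept graph, emitting one language-tagged word per step."""
    tokens = []
    node = start
    for _ in range(max_len):
        lang = rng.choice(langs)                 # pick a language at random
        words = lexicon[node].get(lang, [])
        if words:
            tokens.append(f"{lang}:{rng.choice(words)}")
        if rng.random() < stop_prob:             # assumed stopping rule
            break
        node = rng.choice(graph[node])           # move to a neighbouring concept
    return tokens


def build_pseudo_corpus(graph, lexicon, n_walks, seed=0):
    """Generate n_walks pseudo-sentences from random start nodes."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    return [random_walk(graph, lexicon, rng.choice(nodes), rng)
            for _ in range(n_walks)]


corpus = build_pseudo_corpus(GRAPH, LEXICON, n_walks=5)
for sentence in corpus:
    print(" ".join(sentence))
```

Because words from both languages appear in the same pseudo-sentence, a context-based model sees translations in similar contexts, which is the intuition behind learning the joint space in a single step.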
Findings
Random walks over multilingual WordNets improve embedding quality.
Multilingual WordNets outperform text-only systems in similarity tasks.
Combining WordNets with text data yields the best results.
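For readers unfamiliar with how cross-lingual similarity tasks are usually scored, the following sketch shows the common recipe: compute the cosine similarity between the embeddings of each word pair and report the Spearman correlation with human judgments. The embeddings, word pairs, and gold scores below are invented for illustration, and the paper's exact evaluation protocol may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical 2-d bilingual embeddings in a shared space.
EMB = {
    "en:dog": [0.90, 0.10],
    "es:perro": [0.85, 0.15],
    "es:gato": [0.60, 0.40],
    "es:coche": [0.12, 0.88],
}

# Hypothetical gold similarity judgments for cross-lingual pairs.
GOLD = [
    ("en:dog", "es:perro", 9.5),
    ("en:dog", "es:gato", 6.0),
    ("en:dog", "es:coche", 1.0),
]

predicted = [cosine(EMB[a], EMB[b]) for a, b, _ in GOLD]
gold = [score for _, _, score in GOLD]
rho = spearman(predicted, gold)
print(f"Spearman correlation: {rho:.3f}")
```

Spearman correlation only compares rankings, so systems are rewarded for ordering pairs the way humans do rather than for matching the absolute gold scores.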
Abstract
Bilingual word embeddings represent words of two languages in the same space, and allow knowledge to be transferred from one language to the other without machine translation. The main approach is to train monolingual embeddings first and then map them using bilingual dictionaries. In this work, we present a novel method to learn bilingual embeddings based on multilingual knowledge bases (KBs) such as WordNet. Our method extracts bilingual information from multilingual wordnets via random walks and learns a joint embedding space in one go. We further reinforce cross-lingual equivalence by adding bilingual constraints in the loss function of the popular skipgram model. Our experiments involve twelve cross-lingual word similarity and relatedness datasets in six language pairs covering four languages, and show that: 1) random walks over multilingual wordnets improve results over just using…
