Meemi: A Simple Method for Post-processing and Integrating Cross-lingual Word Embeddings
Yerai Doval, Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert

TL;DR
Meemi introduces a non-orthogonal post-processing step for cross-lingual word embeddings that enhances the alignment and quality of multilingual spaces, leading to improved performance in various NLP tasks.
Contribution
The paper proposes a novel non-orthogonal transformation method for post-processing cross-lingual embeddings, improving multilingual integration and monolingual space quality.
Findings
Enhanced cross-lingual embedding alignment
Improved performance on dictionary induction and word similarity tasks
Better results in cross-lingual NLP applications
Abstract
Word embeddings have become a standard resource in the toolset of any Natural Language Processing practitioner. While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages are integrated together. Current state-of-the-art approaches learn these embeddings by aligning two disjoint monolingual vector spaces through an orthogonal transformation which preserves the structure of the monolingual counterparts. In this work, we propose to apply an additional transformation after this initial alignment step, which aims to bring the vector representations of a given word and its translations closer to their average. Since this additional transformation is non-orthogonal, it also affects the structure of the monolingual spaces. We show that our…
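The two steps described in the abstract can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `orthogonal_align` and `average_maps` are hypothetical, the initial alignment is assumed to be the standard orthogonal Procrustes solution, and the second transformation is assumed to be an unconstrained least-squares linear map toward the average of each translation pair. The data is synthetic.

```python
import numpy as np

def orthogonal_align(X, Y):
    """Orthogonal Procrustes: the orthogonal W minimizing ||X @ W - Y||_F.

    Rows of X and Y are embeddings of dictionary translation pairs.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def average_maps(X_aligned, Y):
    """Least-squares linear maps sending each space toward the pair averages.

    Assumption: an unconstrained (hence generally non-orthogonal) linear map
    per language, fitted on the dictionary pairs.
    """
    M = (X_aligned + Y) / 2.0  # average of each word and its translation
    A, *_ = np.linalg.lstsq(X_aligned, M, rcond=None)
    B, *_ = np.linalg.lstsq(Y, M, rcond=None)
    return A, B

# Synthetic demo: Y is a rotated, noisy copy of X (hypothetical data).
rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
Y = X @ Q + 0.1 * rng.normal(size=(n, d))

W = orthogonal_align(X, Y)      # step 1: orthogonal alignment
X_al = X @ W
A, B = average_maps(X_al, Y)    # step 2: non-orthogonal post-processing

before = np.linalg.norm(X_al - Y)          # pair distance after step 1
after = np.linalg.norm(X_al @ A - Y @ B)   # pair distance after step 2
```

Because the identity map is one candidate for each least-squares fit, the fitted maps can only move the dictionary pairs closer together on the training pairs, never further apart. Unlike `W`, the maps `A` and `B` are not constrained to be orthogonal, so they also reshape each monolingual space, which is the effect the abstract points out.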
