Meemi: A Simple Method for Post-processing and Integrating Cross-lingual Word Embeddings

Yerai Doval; Jose Camacho-Collados; Luis Espinosa-Anke; Steven Schockaert

arXiv:1910.07221·cs.CL·November 12, 2020

TL;DR

Meemi introduces a non-orthogonal post-processing step for cross-lingual word embeddings that improves the alignment and quality of multilingual spaces, leading to better performance on dictionary induction, word similarity, and cross-lingual NLP applications.

Contribution

The paper proposes a novel non-orthogonal transformation method for post-processing cross-lingual embeddings, improving multilingual integration and monolingual space quality.

Findings

1. Enhanced cross-lingual embedding alignment

2. Improved performance on dictionary induction and word similarity tasks

3. Better results in cross-lingual NLP applications

Abstract

Word embeddings have become a standard resource in the toolset of any Natural Language Processing practitioner. While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages are integrated together. Current state-of-the-art approaches learn these embeddings by aligning two disjoint monolingual vector spaces through an orthogonal transformation which preserves the structure of the monolingual counterparts. In this work, we propose to apply an additional transformation after this initial alignment step, which aims to bring the vector representations of a given word and its translations closer to their average. Since this additional transformation is non-orthogonal, it also affects the structure of the monolingual spaces. We show that our…
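The two-stage procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary is assumed to be given as two row-aligned matrices (row i of each holds a word and its translation), the second step is assumed to be an unconstrained least-squares fit toward the pairwise average, the data is synthetic, and the function names `orthogonal_align` and `average_maps` are hypothetical.

```python
import numpy as np


def orthogonal_align(X, Y):
    """Orthogonal Procrustes: the orthogonal W minimising ||X W - Y||_F."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def average_maps(Xa, Y):
    """Fit one linear map per language that sends each vector toward the
    average of the translation pair (assumed: plain least squares)."""
    avg = (Xa + Y) / 2.0
    Mx, *_ = np.linalg.lstsq(Xa, avg, rcond=None)
    My, *_ = np.linalg.lstsq(Y, avg, rcond=None)
    return Mx, My


# Synthetic "dictionary": row i of X and row i of Y are a translation pair.
rng = np.random.default_rng(0)
n_pairs, dim = 500, 20
X = rng.normal(size=(n_pairs, dim))
A = rng.normal(size=(dim, dim))  # arbitrary, non-orthogonal relation
Y = X @ A + 0.1 * rng.normal(size=(n_pairs, dim))

# Step 1: orthogonal alignment, which preserves monolingual structure.
W = orthogonal_align(X, Y)
Xa = X @ W

# Step 2: non-orthogonal maps that pull both spaces toward the averages.
Mx, My = average_maps(Xa, Y)

gap_before = np.linalg.norm(Xa - Y)
gap_after = np.linalg.norm(Xa @ Mx - Y @ My)
print(f"gap after orthogonal alignment: {gap_before:.3f}")
print(f"gap after averaging step:       {gap_after:.3f}")
```

Because the identity map is always a candidate in the least-squares fit, the Frobenius gap between translation pairs cannot grow in the second step. In a real setting the learned maps would presumably be applied to every word vector in each language, not only to the dictionary entries, which is how the monolingual spaces themselves get changed.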
