Cross-Lingual Word Embedding Refinement by $\ell_{1}$ Norm Optimisation

Xutan Peng; Chenghua Lin; Mark Stevenson

arXiv:2104.04916·cs.CL·January 25, 2022

TL;DR

This paper introduces a robust post-processing method for cross-lingual word embeddings using $\ell_{1}$ norm optimisation, significantly improving their quality across multiple languages and tasks.

Contribution

It proposes a novel $\ell_{1}$ norm-based refinement technique that is agnostic to the original training process of CLWEs, enhancing their robustness and performance.

Findings

01. Outperforms four state-of-the-art baselines in bilingual lexicon induction.

02. Improves cross-lingual transfer for natural language inference tasks.

03. Effective across ten diverse languages and various training corpora.

Abstract

Cross-Lingual Word Embeddings (CLWEs) encode words from two or more languages in a shared high-dimensional space in which vectors representing words with similar meaning (regardless of language) are closely located. Existing methods for building high-quality CLWEs learn mappings that minimise the $\ell_{2}$ norm loss function. However, this optimisation objective has been demonstrated to be sensitive to outliers. Based on the more robust Manhattan norm (a.k.a. $\ell_{1}$ norm) goodness-of-fit criterion, this paper proposes a simple post-processing step to improve CLWEs. An advantage of this approach is that it is fully agnostic to the training process of the original CLWEs and can therefore be applied widely. Extensive experiments are performed involving ten diverse languages and embeddings trained on different corpora. Evaluation results based on bilingual lexicon induction and…
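To make the abstract's point about outlier sensitivity concrete, the following is a minimal sketch on synthetic data, not the authors' implementation. It assumes an unconstrained linear map between two toy "embedding" matrices and swaps in iteratively reweighted least squares (IRLS) as a simple stand-in solver for the entry-wise $\ell_{1}$ objective; the function names `fit_l2`, `fit_l1_irls`, and `demo`, and all sizes and noise levels, are hypothetical choices made for illustration.

```python
import numpy as np


def fit_l2(X, Y):
    """Least-squares map W minimising ||XW - Y||_2^2 (unconstrained; an assumption)."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def fit_l1_irls(X, Y, n_iter=50, eps=1e-6):
    """Approximate the map minimising the entry-wise l1 loss sum |XW - Y| via IRLS.

    IRLS is a stand-in solver chosen for brevity, not the paper's optimiser.
    """
    W = fit_l2(X, Y)  # start from the l2 solution
    for _ in range(n_iter):
        residual = np.abs(X @ W - Y)               # shape (n, k)
        weights = 1.0 / np.maximum(residual, eps)  # large residual -> small weight
        for j in range(Y.shape[1]):
            # Weighted least squares for column j: scale rows by sqrt(weight).
            sw = np.sqrt(weights[:, j])
            W[:, j] = np.linalg.lstsq(sw[:, None] * X, sw * Y[:, j], rcond=None)[0]
    return W


def demo(seed=0):
    """Return (max error of l2 fit, max error of l1 fit) on data with outliers."""
    rng = np.random.default_rng(seed)
    n, d = 200, 5
    X = rng.normal(size=(n, d))                 # toy source-language vectors
    W_true = rng.normal(size=(d, d))            # ground-truth map
    Y = X @ W_true + 0.01 * rng.normal(size=(n, d))
    Y[:10] += 20.0 * rng.normal(size=(10, d))   # corrupt 10 pairs (outliers)
    err_l2 = np.abs(fit_l2(X, Y) - W_true).max()
    err_l1 = np.abs(fit_l1_irls(X, Y) - W_true).max()
    return err_l2, err_l1


if __name__ == "__main__":
    err_l2, err_l1 = demo()
    print(f"max |W - W_true|  l2 fit: {err_l2:.4f}   l1 fit: {err_l1:.4f}")
```

In this toy setting the $\ell_{1}$ fit should recover the true map more closely than the $\ell_{2}$ fit, because squared error lets a handful of corrupted pairs dominate the objective while absolute error grows only linearly in each residual.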

Code & Models

Repositories

Pzoom522/L1-Refinement
PyTorch · Official