Unsupervised Alignment of Distributional Word Embeddings

Aïssatou Diallo; Johannes Fürnkranz

arXiv:2203.04863·cs.CL·September 22, 2022

Open Access

TL;DR

This paper introduces an unsupervised method for aligning probabilistic word embeddings, leveraging stochastic optimization to improve bilingual lexicon induction across multiple language pairs.

Contribution

It presents a novel stochastic optimization approach for aligning distributional (probabilistic) embeddings, extending beyond traditional point-vector methods.

Findings

01. Achieves competitive performance on bilingual lexicon induction

02. Outperforms point-vector-based alignment methods

03. Effective across several language pairs

Abstract

Cross-domain alignment plays a key role in tasks ranging from machine translation to transfer learning. Recently, purely unsupervised methods operating on monolingual embeddings have been used successfully to infer a bilingual lexicon without relying on supervision. However, current state-of-the-art methods focus only on point vectors, although distributional embeddings have proven to embed richer semantic information when representing words. In this paper, we propose a stochastic optimization approach for aligning probabilistic embeddings. We evaluate our method on the problem of unsupervised word translation by aligning word embeddings trained on monolingual data. We show that the proposed approach achieves good performance on the bilingual lexicon induction task across several language pairs and performs better than the point-vector-based approach.
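To make the idea of translating with probabilistic embeddings concrete, here is a minimal sketch of the retrieval step only. It is not the authors' method: the abstract does not say which distribution family, distance, or optimizer the paper uses. The sketch assumes, purely for illustration, that each word is a spherical Gaussian (a mean vector plus one scalar standard deviation), that an orthogonal map `Q` from the source to the target space is already given, and that candidates are ranked by the closed-form squared 2-Wasserstein distance. The function names `w2_squared_spherical` and `induce_lexicon` are hypothetical.

```python
import numpy as np


def w2_squared_spherical(mu_a, s_a, mu_b, s_b):
    """Pairwise squared 2-Wasserstein distance between spherical Gaussians.

    mu_a: (n, d) means, s_a: (n,) standard deviations; likewise mu_b, s_b
    with m rows. For N(mu1, s1^2 I) and N(mu2, s2^2 I) the closed form is
    ||mu1 - mu2||^2 + d * (s1 - s2)^2. Returns an (n, m) matrix.
    """
    d = mu_a.shape[1]
    mean_term = ((mu_a[:, None, :] - mu_b[None, :, :]) ** 2).sum(axis=-1)
    std_term = d * (s_a[:, None] - s_b[None, :]) ** 2
    return mean_term + std_term


def induce_lexicon(mu_src, s_src, mu_tgt, s_tgt, Q):
    """Return, for each source word, the index of its nearest target word.

    Q is an assumed, already-learned orthogonal map (d, d). Rotating a
    spherical Gaussian changes only its mean, so the standard deviations
    are passed through unchanged.
    """
    mapped = mu_src @ Q
    return w2_squared_spherical(mapped, s_src, mu_tgt, s_tgt).argmin(axis=1)


if __name__ == "__main__":
    # Synthetic check: the source space is a rotated, shuffled copy of the
    # target space, so the true translation of source word i is perm[i].
    rng = np.random.default_rng(0)
    n, d = 50, 8
    mu_tgt = rng.normal(size=(n, d))
    s_tgt = rng.uniform(0.5, 1.5, size=n)
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random orthogonal matrix
    perm = rng.permutation(n)
    mu_src, s_src = mu_tgt[perm] @ Q.T, s_tgt[perm]
    pred = induce_lexicon(mu_src, s_src, mu_tgt, s_tgt, Q)
    print("recovered all pairs:", bool(np.array_equal(pred, perm)))
```

A point-vector baseline corresponds to dropping the standard-deviation term, which is one way to see what extra information a distributional embedding can contribute to the ranking.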

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification