Unsupervised Alignment of Distributional Word Embeddings
Aïssatou Diallo, Johannes Fürnkranz

TL;DR
This paper introduces an unsupervised method for aligning probabilistic word embeddings, leveraging stochastic optimization to improve bilingual lexicon induction across multiple language pairs.
Contribution
It presents a novel stochastic optimization approach for aligning distributional (probabilistic) embeddings, extending beyond traditional point-vector methods.
Findings
Achieves competitive performance on bilingual lexicon induction
Outperforms point-vector based alignment methods
Effective across several language pairs
Abstract
Cross-domain alignment plays a key role in tasks ranging from machine translation to transfer learning. Recently, purely unsupervised methods operating on monolingual embeddings have been used successfully to infer a bilingual lexicon without relying on supervision. However, current state-of-the-art methods focus only on point vectors, although distributional embeddings have been shown to capture richer semantic information when representing words. In this paper, we propose a stochastic optimization approach for aligning probabilistic embeddings. We evaluate our method on the problem of unsupervised word translation by aligning word embeddings trained on monolingual data. We show that the proposed approach achieves good performance on the bilingual lexicon induction task across several language pairs and performs better than the point-vector based approach.
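To make the evaluation task concrete, here is a minimal sketch of how bilingual lexicon induction is commonly scored once an alignment is available. It is not the authors' method: it covers only the point-vector case, assumes the mapping matrix is already given, and uses cosine nearest-neighbour retrieval as an assumed retrieval rule. The names `induce_lexicon` and `precision_at_1` and the toy data are hypothetical.

```python
import numpy as np


def induce_lexicon(src_vecs, tgt_vecs, mapping):
    """Return, for each source word, the index of its nearest target word.

    Assumption: `mapping` is an already-learned linear map from the source
    space to the target space; learning it is outside this sketch.
    """
    mapped = src_vecs @ mapping
    # Normalise rows so the dot product below is cosine similarity.
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    similarity = mapped @ tgt.T
    return similarity.argmax(axis=1)


def precision_at_1(predicted, gold):
    """Fraction of source words whose top-ranked translation is correct."""
    return float(np.mean(predicted == gold))


# Toy data (hypothetical): the source space is an exact rotation of the
# target space, so a perfect mapping exists by construction.
rng = np.random.default_rng(0)
tgt_vecs = rng.normal(size=(5, 3))
rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
src_vecs = tgt_vecs @ rotation.T  # so that src_vecs @ rotation == tgt_vecs

predicted = induce_lexicon(src_vecs, tgt_vecs, rotation)
score = precision_at_1(predicted, np.arange(5))
```

In an unsupervised setting the mapping would have to be estimated without the gold pairs; those pairs are used here only for scoring.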
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
