Unsupervised Word Mapping Using Structural Similarities in Monolingual Embeddings

Hanan Aldarmaki; Mahesh Mohan; Mona Diab

arXiv:1712.06961·cs.CL·March 26, 2018

TL;DR

This paper introduces an unsupervised method for aligning monolingual word embeddings to induce bilingual dictionaries without relying on parallel data or seed dictionaries, leveraging structural similarities in the embeddings.

Contribution

It presents a novel unsupervised approach that exploits local and global structural similarities in monolingual embeddings for bilingual dictionary induction.

Findings

01 Performance comparable to mapping with supervised correspondents from a seed dictionary

02 Effective without prior alignments or seed dictionaries

03 Applicable to language pairs lacking parallel corpora

Abstract

Most existing methods for automatic bilingual dictionary induction rely on prior alignments between the source and target languages, such as parallel corpora or seed dictionaries. For many language pairs, such supervised alignments are not readily available. We propose an unsupervised approach for learning a bilingual dictionary for a pair of languages given their independently-learned monolingual word embeddings. The proposed method exploits local and global structures in monolingual vector spaces to align them such that similar words are mapped to each other. We show empirically that the performance of bilingual correspondents learned using our proposed unsupervised method is comparable to that of using supervised bilingual correspondents from a seed dictionary.
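
To make the mapping-and-lookup step concrete, here is a minimal sketch of the supervised setting the abstract compares against: fitting an orthogonal map from seed pairs, then inducing a dictionary by nearest-neighbor search. It is not the paper's unsupervised procedure, which the abstract does not specify in detail. The function names `fit_orthogonal_map` and `induce_dictionary`, the use of orthogonal Procrustes, cosine retrieval, and the toy random data are all assumptions made for illustration.

```python
import numpy as np

def fit_orthogonal_map(src, tgt):
    """Orthogonal W minimizing ||src @ W - tgt||_F for row-paired matrices.

    Assumption: an orthogonal Procrustes fit, a common choice for
    embedding alignment; the paper may use a different transform.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

def induce_dictionary(src, tgt, w):
    """For each source row, return the index of the nearest target row (cosine)."""
    mapped = src @ w
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    tgt_n = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    return np.argmax(mapped @ tgt_n.T, axis=1)

# Toy data (hypothetical): the "target" space is an exact rotation of the
# "source" space, so row i of src corresponds to row i of tgt.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 8))
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
tgt = src @ rotation

# Use the first 20 rows as a hypothetical seed dictionary.
w = fit_orthogonal_map(src[:20], tgt[:20])
pred = induce_dictionary(src, tgt, w)
```

In the unsupervised setting described in the abstract, the seed rows would not be given. The correspondents would instead have to be found from structural similarities between the two spaces before a map like `w` can be fit.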
