Learning aligned embeddings for semi-supervised word translation using Maximum Mean Discrepancy

Antonio H. O. Fonseca; David van Dijk

arXiv:2006.11578·cs.CL·June 23, 2020

TL;DR

This paper introduces WAM, an end-to-end method for aligning word embeddings across languages during translation training using MMD, outperforming existing supervised and unsupervised approaches.

Contribution

The paper presents a novel unsupervised approach for word embedding alignment that does not require known word pairs, using a localized MMD constraint during translation training.

Findings

01. WAM outperforms existing unsupervised methods.

02. WAM surpasses supervised methods trained on known translations.

03. WAM aligns embeddings effectively during sentence translation training.

Abstract

Word translation is an integral part of language translation. In machine translation, each language is considered a domain with its own word embedding. The alignment between word embeddings allows linking semantically equivalent words in multilingual contexts. Moreover, it offers a way to infer cross-lingual meaning for words without a direct translation. Current methods for word embedding alignment are either supervised, i.e. they require known word pairs, or learn a cross-domain transformation on fixed embeddings in an unsupervised way. Here we propose an end-to-end approach for word embedding alignment that does not require known word pairs. Our method, termed Word Alignment through MMD (WAM), learns embeddings that are aligned during sentence translation training using a localized Maximum Mean Discrepancy (MMD) constraint between the embeddings. We show that our method not only…
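
As a rough illustration of the quantity the abstract refers to, the sketch below computes a standard biased estimate of squared MMD between two sets of vectors with a Gaussian (RBF) kernel. It is not the paper's method: the abstract does not specify the kernel or how the constraint is "localized", so the kernel choice, the `bandwidth` value, the function names, and the synthetic "embeddings" are all assumptions made for illustration.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_gaussian(n, dim, mean, rng):
    """Hypothetical stand-in for word embeddings: n random vectors of size dim."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
source = sample_gaussian(50, 4, 0.0, rng)   # "language A" embeddings (synthetic)
aligned = sample_gaussian(50, 4, 0.0, rng)  # same distribution as source
shifted = sample_gaussian(50, 4, 3.0, rng)  # distribution moved away from source

mmd_aligned = mmd_squared(source, aligned)
mmd_shifted = mmd_squared(source, shifted)
print(mmd_aligned, mmd_shifted)
```

The estimate is small when both samples come from the same distribution and larger when one is shifted, which is why a term of this kind can serve as a penalty that pulls two embedding spaces together during training.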

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification