Weakly-Supervised Concept-based Adversarial Learning for Cross-lingual Word Embeddings

Haozhou Wang; James Henderson; Paola Merlo

arXiv:1904.09446·cs.CL·April 23, 2019·1 cites

Open Access

TL;DR

This paper introduces a weakly-supervised adversarial learning approach that improves cross-lingual word embeddings by focusing on concept-level mappings, especially benefiting typologically distant language pairs.

Contribution

It proposes a novel concept-based adversarial training method that enhances alignment quality over previous unsupervised approaches, particularly for distant languages.

Findings

01. Improves cross-lingual embedding alignment for distant languages.

02. Outperforms previous unsupervised methods in concept mapping accuracy.

03. Enhances performance without relying on high-quality parallel data.

Abstract

Distributed representations of words, which map each word to a continuous vector, have proven useful in capturing important linguistic information not only in a single language but also across different languages. Current unsupervised adversarial approaches show that it is possible to build a mapping matrix that aligns two sets of monolingual word embeddings without high-quality parallel data such as a dictionary or a sentence-aligned corpus. However, without post-refinement, the preliminary mapping produced by these methods performs poorly, leading to poor results for typologically distant languages. In this paper, we propose a weakly-supervised adversarial training method to overcome this limitation, based on the intuition that mapping across languages is better done at the concept level than at the word level. We propose a concept-based adversarial training method which…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection