CLUSE: Cross-Lingual Unsupervised Sense Embeddings

Ta-Chung Chi; Yun-Nung Chen

arXiv:1809.05694·cs.CL·October 23, 2018

TL;DR

This paper introduces CLUSE, a modular model that learns bilingual sense embeddings aligned across languages from an English-Chinese parallel corpus. The embeddings are evaluated on the existing SCWS dataset and on BCWS, a new cross-lingual dataset introduced in the paper.

Contribution

The paper presents a novel modular approach for jointly learning bilingual sense embeddings aligned in a shared vector space, and introduces BCWS, a new dataset for cross-lingual evaluation.

Findings

01 Sense embeddings are effectively aligned in the bilingual space.

02 The model outperforms existing methods on monolingual and bilingual evaluations.

03 The BCWS dataset provides a new benchmark for cross-lingual sense similarity.

Abstract

This paper proposes a modularized sense induction and representation learning model that jointly learns bilingual sense embeddings that align well in the vector space. The cross-lingual signal in an English-Chinese parallel corpus is exploited to capture the collocation and distributed characteristics of the language pair. The model is evaluated on the Stanford Contextual Word Similarity (SCWS) dataset to ensure the quality of the monolingual sense embeddings. In addition, we introduce Bilingual Contextual Word Similarity (BCWS), a large and high-quality dataset for evaluating cross-lingual sense embeddings, which is the first attempt at measuring whether the learned embeddings are indeed well aligned in the vector space. The proposed approach yields sense embeddings of superior quality in both monolingual and bilingual evaluations.
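Contextual word similarity datasets such as SCWS are commonly scored by correlating a model's similarity for each word pair with the human rating of that pair. The sketch below illustrates that general idea with cosine similarity and Spearman's rank correlation. It is not taken from the paper: the metric choice, the function names, and the toy vectors and ratings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy data: (sense vector 1, sense vector 2, human rating).
# In a real evaluation the vectors would be the sense embeddings selected
# for each word given its sentence context.
pairs = [
    ([1.0, 0.0], [0.9, 0.1], 9.0),
    ([1.0, 0.0], [0.5, 0.5], 6.0),
    ([1.0, 0.0], [0.0, 1.0], 1.0),
    ([1.0, 0.0], [-1.0, 0.2], 0.5),
]

model_scores = [cosine(u, v) for u, v, _ in pairs]
human_scores = [h for _, _, h in pairs]
rho = spearman(model_scores, human_scores)
print(f"Spearman correlation: {rho:.3f}")
```

For a bilingual dataset such as BCWS, the same scoring would apply with one vector drawn from the English sense embeddings and the other from the Chinese ones, which is only meaningful if the two sets share an aligned vector space.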

Code & Models

Repositories

MiuLab/CLUSE
tf · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems