A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique
Rui Wang, Hai Zhao, Sabine Ploux, Bao-Liang Lu, Masao Utiyama, and Eiichiro Sumita

TL;DR
This paper introduces the Bilingual Sense Clique (BSC), a latent sense unit for bilingual word embedding that goes beyond shallow context and simple co-occurrence, improving lexical translation accuracy without a separate bilingual projection step.
Contribution
It proposes a latent bilingual sense clique derived from maximal complete sub-graphs, enabling more effective bilingual embeddings without separate projection steps.
Findings
Outperforms existing bilingual word embedding methods in lexicon translation tasks.
Uses maximum complete sub-graphs of PMI-based graphs to derive bilingual sense units.
Demonstrates effectiveness across multiple dimension reduction techniques.
Abstract
Most existing methods for bilingual word embedding consider only shallow context or simple co-occurrence information. In this paper, we propose a latent bilingual sense unit (Bilingual Sense Clique, BSC), which is derived from a maximum complete sub-graph of a pointwise mutual information based graph over a bilingual corpus. In this way, we treat source and target words equally, and the separate bilingual projection step that most existing work requires is no longer necessary. Several dimension reduction methods are evaluated to summarize the BSC-word relationship. The proposed method is evaluated on bilingual lexicon translation tasks, and empirical results show that bilingual sense embedding methods outperform existing bilingual word embedding methods.
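The clique construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made for illustration: each aligned sentence pair is treated as one bag of words, tokens carry a language prefix such as `en:` or `fr:`, an edge is kept when PMI exceeds a hypothetical `threshold`, and cliques are enumerated with a plain Bron-Kerbosch search. The sketch enumerates maximal cliques (complete sub-graphs that cannot be extended), which is one reading of the abstract's "maximum complete sub-graph". PMI is computed as log(p(a, b) / (p(a) p(b))), with probabilities estimated from sentence-pair counts.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_graph(sentence_pairs, threshold=0.0):
    """Link two words when their PMI over aligned sentence pairs exceeds threshold.

    Assumption: source and target words of a pair share one co-occurrence bag,
    so both languages are treated equally in the same graph.
    """
    word_counts = Counter()
    pair_counts = Counter()
    n = len(sentence_pairs)
    for src, tgt in sentence_pairs:
        words = sorted(set(src) | set(tgt))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))

    graph = {w: set() for w in word_counts}
    for (a, b), c in pair_counts.items():
        # PMI = log( p(a,b) / (p(a) p(b)) ) with counts normalised by n
        pmi = math.log(c * n / (word_counts[a] * word_counts[b]))
        if pmi > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Toy corpus (invented for illustration, not data from the paper).
pairs = [
    (["en:bank", "en:money"], ["fr:banque", "fr:argent"]),
    (["en:bank", "en:river"], ["fr:rive", "fr:fleuve"]),
    (["en:money"], ["fr:argent"]),
]
graph = pmi_graph(pairs)
cliques = maximal_cliques(graph)
for clique in cliques:
    print(sorted(clique))
```

On this toy corpus, `en:bank` ends up in more than one clique, which is the sense in which a clique can act as a sense unit shared by words of both languages. Recording which words belong to which cliques gives a BSC-word relationship of the kind the abstract says is then summarized by dimension reduction.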
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies
