Improve Lexicon-based Word Embeddings By Word Sense Disambiguation
Yuanzhi Ke, Masafumi Hagiwara

TL;DR
This paper introduces a novel lexicon-based word embedding method that incorporates word sense disambiguation to improve embeddings for polysemous words and enhances performance in text classification tasks.
Contribution
It proposes a new approach that considers the relatedness and differences between corpus and lexicon, using sense disambiguation to refine embeddings for polysemous words.
Findings
Improved embeddings for polysemous words.
Enhanced text classification performance.
Outperformed prior methods in word similarity tasks.
Abstract
Several prior works learn a lexicon together with the corpus to improve word embeddings. However, they either model the lexicon separately but update the neural networks for both the corpus and the lexicon with the same likelihood, or minimize the distance between all of the synonym pairs in the lexicon. Such methods do not consider the relatedness and difference of the corpus and the lexicon, and may not be optimally trained. In this paper, we propose a novel method that considers the relatedness and difference of the corpus and the lexicon. It trains word embeddings by learning from the corpus to predict a word and its corresponding synonym under the context at the same time. For polysemous words, we use a word sense disambiguation filter to eliminate the synonyms whose meanings do not fit the context. To evaluate the proposed method, we compare the performance of…
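The idea in the abstract, predicting a word and its context-appropriate synonyms at the same time, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the objective or the filter, so the summed negative log-likelihood, the cosine-similarity filter with its `threshold`, and all function names and toy vectors below are assumptions made for illustration only.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def filter_synonyms(context_vec, synonyms, emb, threshold=0.0):
    """Hypothetical stand-in for a word sense disambiguation filter:
    keep only synonyms whose embedding points the same way as the context."""
    return [s for s in synonyms if cosine(context_vec, emb[s]) > threshold]


def joint_loss(context_words, target, synonyms, emb, vocab, threshold=0.0):
    """Assumed objective: negative log-likelihood of the target word plus
    that of every synonym that survives the filter, given the same context."""
    ctx = mean_vector([emb[w] for w in context_words])
    probs = dict(zip(vocab, softmax([dot(ctx, emb[w]) for w in vocab])))
    kept = filter_synonyms(ctx, synonyms, emb, threshold)
    loss = -math.log(probs[target])
    for s in kept:
        loss -= math.log(probs[s])
    return loss, kept


# Toy 2-D embeddings (made up): "bank" is polysemous.
emb = {
    "river": [1.0, 0.0], "water": [0.9, 0.1], "bank": [0.5, 0.5],
    "shore": [0.8, 0.2], "lender": [-0.7, 0.6], "money": [-0.9, 0.3],
}
vocab = list(emb)

loss_river, kept_river = joint_loss(["river", "water"], "bank", ["shore", "lender"], emb, vocab)
loss_money, kept_money = joint_loss(["money"], "bank", ["shore", "lender"], emb, vocab)
print(kept_river, kept_money)
```

In this toy setup, a river-related context keeps only "shore" as a synonym of "bank", while a money-related context keeps only "lender", so each training step pulls the embedding toward the sense that matches the context. A real system would use learned input and output embeddings, gradient updates, and a proper disambiguation method in place of the cosine threshold.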
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
