Fuzzy paraphrases in learning word representations with a lexicon
Yuanzhi Ke, Masafumi Hagiwara

TL;DR
This paper introduces a novel method for learning word representations that uses a reliability-annotated paraphrase lexicon to mitigate polysemy issues, improving vector quality without multi-vector complexity.
Contribution
It proposes an approach that weights each paraphrase by its reliability rather than treating all paraphrases equally, which reduces the noise that paraphrases of polysemous words introduce into word vectors.
Findings
Improved word vector quality demonstrated in experiments
Outperforms previous lexicon-based methods
Addresses polysemy without multi-vector representations
Abstract
A synonym of a polysemous word is usually a paraphrase of only one of its senses. When lexicons are used to improve vector-space word representations, such paraphrases are unreliable and introduce noise into the vector space. Prior works use a coefficient to adjust the overall learning of the lexicons, and they treat all paraphrases equally. In this paper, we propose a novel approach that treats paraphrases differently to alleviate the adverse effects of polysemy. We annotate each paraphrase with a degree of reliability. The paraphrases are randomly eliminated according to these degrees when our model learns word representations. In this way, our approach drops the unreliable paraphrases while keeping the more reliable ones. The experimental results show that the proposed method improves the word vectors. Our approach is an attempt to address the polysemy problem keeping…
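The random elimination step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each reliability degree lies in [0, 1] and is used directly as the probability of keeping the paraphrase in a given training step. The function name `sample_paraphrases`, the example words, and the scores are all hypothetical.

```python
import random


def sample_paraphrases(paraphrases, rng):
    """Return the paraphrases that survive one round of random elimination.

    `paraphrases` is a list of (word, reliability) pairs. Assumption: the
    reliability is a value in [0, 1] and is used as the keep probability.
    """
    return [word for word, reliability in paraphrases
            if rng.random() < reliability]


# Hypothetical lexicon entry for the polysemous word "bank".
# The words and scores are made up for illustration only.
lexicon_entry = [("shore", 0.9), ("lender", 0.6), ("tilt", 0.1)]

rng = random.Random(0)  # seeded so that a run is repeatable
for step in range(3):
    kept = sample_paraphrases(lexicon_entry, rng)
    print(f"step {step}: kept {kept}")
```

Under this assumption, a paraphrase with reliability 0.9 takes part in roughly nine out of ten updates and one with reliability 0.1 in roughly one out of ten, so unreliable paraphrases contribute little to the learned vectors without being removed from the lexicon outright.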
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
