Fuzzy paraphrases in learning word representations with a lexicon

Yuanzhi Ke; Masafumi Hagiwara

arXiv:1611.00674·cs.CL·September 11, 2017

Open Access

TL;DR

This paper introduces a novel method for learning word representations that uses a reliability-annotated paraphrase lexicon to mitigate polysemy issues, improving vector quality without multi-vector complexity.

Contribution

It proposes an approach that treats paraphrases differently according to their annotated reliability, reducing the noise that paraphrases of polysemous words introduce into word vectors.

Findings

01 Improved word vector quality demonstrated in experiments

02 Outperforms previous lexicon-based methods

03 Addresses polysemy without multi-vector representations

Abstract

A synonym of a polysemous word is usually a paraphrase of only one of its senses. When lexicons are used to improve vector-space word representations, such paraphrases are unreliable and introduce noise into the vector space. Prior work uses a single coefficient to adjust how strongly the lexicon as a whole influences learning, and treats all paraphrases equally. In this paper, we propose a novel approach that treats paraphrases differently to alleviate the adverse effects of polysemy. We annotate each paraphrase with a degree of reliability. When our model learns word representations, paraphrases are randomly eliminated according to these degrees. In this way, our approach tends to drop unreliable paraphrases while keeping the more reliable ones. The experimental results show that the proposed method improves the word vectors. Our approach is an attempt to address the polysemy problem keeping…
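The random elimination step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact sampling rule, so this assumes each paraphrase is kept independently with probability equal to its reliability degree, taken to lie in [0, 1]. The function name `sample_paraphrases` and the lexicon entry are hypothetical and are not taken from the paper.

```python
import random


def sample_paraphrases(paraphrases, rng):
    """Keep each (word, reliability) pair with probability `reliability`.

    Assumption: the reliability degree lies in [0, 1] and is used directly
    as a keep probability. The paper may define the rule differently.
    """
    kept = []
    for word, reliability in paraphrases:
        # rng.random() is uniform on [0, 1), so a degree of 1.0 always
        # keeps the paraphrase and a degree of 0.0 always drops it.
        if rng.random() < reliability:
            kept.append(word)
    return kept


# Hypothetical lexicon entry for the polysemous word "bank".
lexicon_entry = [
    ("financial_institution", 1.0),
    ("riverside", 0.3),
    ("tilt", 0.0),
]

rng = random.Random(0)
trials = 10_000
counts = {word: 0 for word, _ in lexicon_entry}
for _ in range(trials):
    for word in sample_paraphrases(lexicon_entry, rng):
        counts[word] += 1

# Empirical keep rate per paraphrase across simulated training steps.
keep_rates = {word: count / trials for word, count in counts.items()}
print(keep_rates)
```

Under this assumption, a paraphrase with a low degree still contributes occasionally across many training steps, while a highly reliable one contributes almost every time.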

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies