Robust Handling of Polysemy via Sparse Representations

Abhijit Mahabal; Dan Roth; Sid Mittal

arXiv:1805.07398·cs.CL·May 22, 2018

TL;DR

This paper introduces Category Builder, a system that uses sparse distributed representations to capture polysemy and the multiple facets of words. It outperforms dense models such as Word2Vec on certain classes of analogy problems.

Contribution

The paper argues that sparse representations are better suited than dense ones for modeling polysemy and multi-faceted lexical meaning, and presents Category Builder, a working system built on sparse representations.

Findings

01

Category Builder outperforms Word2Vec in certain analogy classes.

02

Sparse representations effectively capture multiple word facets.

03

The system supports complex meaning distinctions and set expansion tasks.

Abstract

Words are polysemous and multi-faceted, with many shades of meanings. We suggest that sparse distributed representations are more suitable than other, commonly used, (dense) representations to express these multiple facets, and present Category Builder, a working system that, as we show, makes use of sparse representations to support multi-faceted lexical representations. We argue that the set expansion task is well suited to study these meaning distinctions since a word may belong to multiple sets with a different reason for membership in each. We therefore exhibit the performance of Category Builder on this task, while showing that our representation captures at the same time analogy problems such as "the Ganga of Egypt" or "the Voldemort of Tolkien". Category Builder is shown to be a more expressive lexical representation and to outperform dense representations such as Word2Vec in…
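
To make the set expansion idea concrete, here is a minimal sketch of how a sparse, context-based representation can give one word a different reason for membership in different sets. It is not the paper's algorithm: the feature table `FEATURES`, the context names (`eat_X`, `X_shares`, and so on), the weights, and the scoring rule in `shared_contexts` and `expand` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Toy sparse representation: word -> {context feature: weight}.
# All words, contexts, and weights are invented for this illustration.
FEATURES = {
    "apple":     {"eat_X": 2.0, "X_juice": 1.5, "X_shares": 2.0, "X_ceo": 1.0},
    "banana":    {"eat_X": 2.5, "X_juice": 0.5, "peel_X": 2.0},
    "mango":     {"eat_X": 2.0, "X_juice": 2.0, "peel_X": 1.0},
    "google":    {"X_shares": 2.5, "X_ceo": 2.0, "search_X": 1.0},
    "microsoft": {"X_shares": 2.0, "X_ceo": 2.5},
}


def shared_contexts(seeds):
    """Return contexts present for every seed, weighted by the smallest seed weight."""
    common = set(FEATURES[seeds[0]])
    for seed in seeds[1:]:
        common &= set(FEATURES[seed])
    return {c: min(FEATURES[s][c] for s in seeds) for c in common}


def expand(seeds, top_k=3):
    """Rank non-seed words by their weighted overlap with the seeds' shared contexts."""
    contexts = shared_contexts(seeds)
    scores = defaultdict(float)
    for word, feats in FEATURES.items():
        if word in seeds:
            continue
        for context, weight in contexts.items():
            if context in feats:
                scores[word] += weight * feats[context]
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


fruit_like = expand(["apple", "banana"])
company_like = expand(["apple", "google"])
print(fruit_like, company_like)
```

In this toy setup the second seed selects which facet of "apple" is active: pairing it with "banana" keeps only the food contexts, while pairing it with "google" keeps only the company contexts. Because each facet lives in its own set of features, restricting to the shared contexts is enough to separate the senses, which is the kind of behavior the abstract attributes to sparse representations.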
