Word sense induction using word embeddings and community detection in complex networks

Edilson A. Corrêa Jr.; Diego R. Amancio

arXiv:1803.08476·cs.CL·March 6, 2019

TL;DR

This paper presents an unsupervised method for word sense induction that uses word embeddings and community detection in complex networks, outperforming existing approaches without relying on structured knowledge sources.

Contribution

The authors introduce a novel unsupervised approach combining context embeddings and community detection for WSI, eliminating the need for domain-specific knowledge.

Findings

01. Outperforms existing WSI algorithms and baselines

02. Effective in inducing multiple senses from corpora

03. Operates without structured external knowledge

Abstract

Word Sense Induction (WSI) is the task of automatically inducing word senses from corpora. The WSI task was first proposed to overcome the limitations of the manually annotated corpora that word sense disambiguation systems require. Even though several works have been proposed to induce word senses, existing systems are still very limited in the sense that they make use of structured, domain-specific knowledge sources. In this paper, we devise a method that leverages recent findings in word embeddings research to generate context embeddings, which are embeddings containing information about the semantic context of a word. In order to induce senses, we model the set of ambiguous words as a complex network. In the generated network, two instances (nodes) are connected if the respective context embeddings are similar. Upon using well-established community detection methods to…
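The pipeline outlined in the abstract (context embeddings, a similarity network over instances of an ambiguous word, then community detection) can be illustrated with a minimal Python sketch. Several parts are assumptions that the abstract does not confirm: the context embedding is taken to be the mean of the context words' vectors, edges are created with a fixed cosine threshold, and a simple deterministic label propagation stands in for the "well-established community detection methods" the authors mention. The toy vectors, the function names, and the threshold of 0.8 are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_embedding(context_words, word_vectors):
    """Assumption: a context embedding is the mean of the known context word vectors."""
    known = [word_vectors[w] for w in context_words if w in word_vectors]
    if not known:
        raise ValueError("no context word has a known vector")
    return [sum(column) / len(known) for column in zip(*known)]


def build_network(embeddings, threshold):
    """Connect two instances (nodes) when their context embeddings are similar.

    Assumption: 'similar' means cosine similarity >= threshold.
    """
    adjacency = {i: set() for i in range(len(embeddings))}
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def label_propagation(adjacency, max_rounds=100):
    """Stand-in community detection: each node adopts its neighbours' most common label.

    Ties are broken by the smallest label so the result is deterministic.
    """
    labels = {node: node for node in adjacency}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adjacency):
            if not adjacency[node]:
                continue  # isolated instances keep their own label
            counts = Counter(labels[nb] for nb in adjacency[node])
            top = max(counts.values())
            new_label = min(lab for lab, c in counts.items() if c == top)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


# Hypothetical 2-d word vectors: first axis ~ finance, second axis ~ nature.
word_vectors = {
    "money": [1.0, 0.0], "loan": [0.9, 0.1], "deposit": [0.8, 0.2],
    "river": [0.0, 1.0], "water": [0.1, 0.9], "shore": [0.2, 0.8],
}

# Context words around six occurrences of the ambiguous word "bank".
contexts = [
    ["money", "loan"], ["loan", "deposit"], ["money", "deposit"],
    ["river", "water"], ["water", "shore"], ["river", "shore"],
]

embeddings = [context_embedding(c, word_vectors) for c in contexts]
network = build_network(embeddings, threshold=0.8)
senses = label_propagation(network)  # maps instance index -> induced sense label
```

In this toy setting the first three occurrences share one label and the last three share another, so each community plays the role of one induced sense. With real embeddings, the choice of similarity criterion and community detection method would need to follow the paper itself rather than this sketch.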
