Word Sense Disambiguation using a Bidirectional LSTM

Mikael Kågebäck; Hans Salomonsson

arXiv:1606.03568 · cs.CL · November 22, 2016 · 38 cites

TL;DR

This paper introduces a bidirectional LSTM model for word sense disambiguation that is simple and scalable, and that performs on par with state-of-the-art systems without external resources or handcrafted features.

Contribution

The authors propose a shared bidirectional LSTM model trained end-to-end for WSD, demonstrating competitive performance without external knowledge or language-specific features.

Findings

01 Achieves results statistically equivalent to those of state-of-the-art systems.

02 Uses no external resources or handcrafted rules.

03 Scales well with vocabulary size.

Abstract

In this paper we present a clean, yet effective, model for word sense disambiguation. Our approach leverages a bidirectional long short-term memory network which is shared between all words. This enables the model to share statistical strength and to scale well with vocabulary size. The model is trained end-to-end, directly from the raw text to sense labels, and makes effective use of word order. We evaluate our approach on two standard datasets, using identical hyperparameter settings, which are in turn tuned on a third set of held-out data. We employ no external resources (e.g. knowledge graphs, part-of-speech tagging, etc.), language-specific features, or hand-crafted rules, but still achieve results statistically equivalent to those of the best state-of-the-art systems, which are subject to no such limitations.
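
To make the shared-network idea in the abstract concrete, the following is a minimal, untrained forward-pass sketch in plain Python. It is not the authors' implementation. The class names, dimensions, bias-free gates, the per-word output layer, the choice to read only the context on either side of the target word, and the sense labels such as "bank%finance" are all assumptions made for illustration. What it is meant to show is that a single pair of forward and backward LSTMs serves every target word, so the only parameters that grow with the number of ambiguous words are the small output layers.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """A bias-free LSTM cell (a simplification assumed for brevity)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.W = {g: rand_matrix(hidden_dim, input_dim + hidden_dim, rng)
                  for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(a) for a in matvec(self.W["i"], z)]    # input gate
        f = [sigmoid(a) for a in matvec(self.W["f"], z)]    # forget gate
        o = [sigmoid(a) for a in matvec(self.W["o"], z)]    # output gate
        g = [math.tanh(a) for a in matvec(self.W["g"], z)]  # candidate state
        c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
        h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        # Return the final hidden state after reading the sequence xs.
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in xs:
            h, c = self.step(x, h, c)
        return h


class BiLSTMSenseTagger:
    """Hypothetical sketch: shared BiLSTM, one small softmax layer per word."""

    def __init__(self, vocab, senses, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.index = {w: i for i, w in enumerate(vocab)}
        self.embed = rand_matrix(len(vocab), embed_dim, rng)
        # These two cells are shared by every target word.
        self.fwd = LSTMCell(embed_dim, hidden_dim, rng)
        self.bwd = LSTMCell(embed_dim, hidden_dim, rng)
        # Assumed design: only the output layer is specific to each word.
        self.senses = senses
        self.out = {w: rand_matrix(len(s), 2 * hidden_dim, rng)
                    for w, s in senses.items()}

    def predict_proba(self, tokens, target_pos):
        xs = [self.embed[self.index[t]] for t in tokens]
        left = self.fwd.run(xs[:target_pos])              # read left to right
        right = self.bwd.run(xs[target_pos + 1:][::-1])   # read right to left
        word = tokens[target_pos]
        logits = matvec(self.out[word], left + right)
        m = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        return dict(zip(self.senses[word], (e / total for e in exps)))


vocab = ["he", "sat", "on", "the", "bank", "of", "river"]
senses = {"bank": ["bank%finance", "bank%river"]}  # hypothetical sense labels
model = BiLSTMSenseTagger(vocab, senses)
sentence = "he sat on the bank of the river".split()
probs = model.predict_proba(sentence, target_pos=4)
print(probs)
```

Because the weights are random, the printed distribution is close to uniform and carries no meaning. In an end-to-end setup of the kind the abstract describes, the embeddings, both LSTMs, and the output layers would be trained jointly on sense-labelled text.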

Code & Models

Repositories

https://bitbucket.org/salomons/wsd
TensorFlow · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems