Word Sense Disambiguation with LSTM: Do We Really Need 100 Billion Words?

Minh Le; Marten Postma; Jacopo Urbani

arXiv:1712.03376 · cs.CL · December 19, 2017 · 6 cites

Open Access

TL;DR

This study reproduces a previous LSTM-based WSD approach using open datasets and software, demonstrating comparable results with significantly less data than originally claimed.

Contribution

It provides a reproducibility analysis of LSTM-based WSD, showing that high performance does not require extremely large datasets.

Findings

01 State-of-the-art results achieved with less data

02 Open-source code and models released for community use

03 Reproduction confirms effectiveness of LSTM for WSD

Abstract

Recently, Yuan et al. (2016) have shown the effectiveness of using Long Short-Term Memory (LSTM) for performing Word Sense Disambiguation (WSD). Their proposed technique outperformed the previous state of the art on several benchmarks, but neither the training data nor the source code was released. This paper presents the results of a reproduction study of this technique using only openly available datasets (GigaWord, SemCor, OMSTI) and software (TensorFlow). The study shows that state-of-the-art results can be obtained with much less data than hinted at by Yuan et al. All code and trained models are made freely available.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis