# Named Entity Disambiguation for Noisy Text

**Authors:** Yotam Eshel, Noam Cohen, Kira Radinsky, Shaul Markovitch, Ikuya Yamada, Omer Levy

arXiv: 1706.09147 · 2017-07-04

## TL;DR

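This paper introduces WikilinksNED, a large-scale dataset of noisy web text fragments for Named Entity Disambiguation (NED), along with a neural model designed for the limited, noisy local context around each mention. The model significantly outperforms existing state-of-the-art methods on WikilinksNED and performs comparably on a smaller newswire dataset.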

## Contribution

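The paper contributes WikilinksNED, a large-scale noisy NED dataset; a neural model for limited, noisy local context; a novel method for sampling informative negative examples during training; and a new way of initializing word and entity embeddings.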

## Key findings

- The model significantly outperforms existing state-of-the-art methods on WikilinksNED
- It achieves comparable performance on a smaller newswire dataset
- The new initialization of word and entity embeddings significantly improves performance

## Abstract

We address the task of Named Entity Disambiguation (NED) for noisy text. We present WikilinksNED, a large-scale NED dataset of text fragments from the web, which is significantly noisier and more challenging than existing news-based datasets. To capture the limited and noisy local context surrounding each mention, we design a neural model and train it with a novel method for sampling informative negative examples. We also describe a new way of initializing word and entity embeddings that significantly improves performance. Our model significantly outperforms existing state-of-the-art methods on WikilinksNED while achieving comparable performance on a smaller newswire dataset.
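The abstract does not spell out how negative examples are sampled, so the following is only a minimal sketch of one plausible reading: for each training mention, wrong entities are drawn from the other candidates that share the mention's surface form, which tends to give harder negatives than sampling from all entities. The `sample_negatives` function, the `CANDIDATES` table, and the entity names are all hypothetical and are not taken from the paper.

```python
import random


def sample_negatives(gold_entity, candidates, k, rng):
    """Return up to k wrong entities drawn from the mention's candidate list.

    Assumption (not from the paper): "informative" negatives are taken to be
    other entities that the same surface form could refer to.
    """
    wrong = [c for c in candidates if c != gold_entity]
    if len(wrong) <= k:
        return wrong
    return rng.sample(wrong, k)


# Hypothetical candidate table: surface form -> entities it may refer to.
CANDIDATES = {
    "Jaguar": ["Jaguar_(animal)", "Jaguar_Cars", "Jacksonville_Jaguars"],
}

rng = random.Random(0)  # seeded so the sketch is reproducible
negatives = sample_negatives("Jaguar_Cars", CANDIDATES["Jaguar"], k=1, rng=rng)
```

Each sampled negative would then be paired with the same mention and context as the gold entity, so the model sees one correct and one or more incorrect mappings for the same text fragment.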

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.09147/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1706.09147/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1706.09147/full.md

---
Source: https://tomesphere.com/paper/1706.09147