Fine-Grained Entity Typing for Domain Independent Entity Linking

Yasumasa Onoe; Greg Durrett

arXiv:1909.05780 · cs.CL · January 9, 2020 · 6 citations

TL;DR

This paper presents a domain-independent entity linking approach that models fine-grained entity properties using Wikipedia data, improving generalization and performance on unseen datasets.

Contribution

The authors introduce a large inventory of entity types from Wikipedia and a typing-based linking method that outperforms prior models in domain-independent scenarios.

Findings

01. Outperforms prior domain-independent entity linking systems on CoNLL-YAGO.

02. Generalizes better than neural models on unseen mention-entity pairs.

03. Uses large-scale Wikipedia-derived entity types for robust disambiguation.

Abstract

Neural entity linking models are very powerful, but run the risk of overfitting to the domain they are trained in. For this problem, a domain is characterized not just by genre of text but even by factors as specific as the particular distribution of entities, as neural models tend to overfit by memorizing properties of frequent entities in a dataset. We tackle the problem of building robust entity linking models that generalize effectively and do not rely on labeled entity linking data with a specific entity distribution. Rather than predicting entities directly, our approach models fine-grained entity properties, which can help disambiguate between even closely related entities. We derive a large inventory of types (tens of thousands) from Wikipedia categories, and use hyperlinked mentions in Wikipedia to distantly label data and train an entity typing model. At test time, we classify…
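
The abstract is cut off before it describes the test-time step, so the following is only a minimal sketch of how typing-based linking could work, not the authors' implementation. It assumes a typing model has already produced a probability for each type given the mention in context, and that each candidate entity carries a set of Wikipedia-derived types. The scoring rule (sum the predicted probabilities of a candidate's types), the function names, and the example types and numbers are all hypothetical.

```python
from typing import Dict, Optional, Set


def score_candidates(
    type_probs: Dict[str, float],
    candidate_types: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Score each candidate entity by summing the predicted
    probabilities of the types it carries (assumed scoring rule)."""
    return {
        entity: sum(type_probs.get(t, 0.0) for t in types)
        for entity, types in candidate_types.items()
    }


def link_mention(
    type_probs: Dict[str, float],
    candidate_types: Dict[str, Set[str]],
) -> Optional[str]:
    """Return the highest-scoring candidate, or None if there are none."""
    scores = score_candidates(type_probs, candidate_types)
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical typing-model output for the mention "Jordan" in a
# sentence about basketball. These numbers are made up for illustration.
example_type_probs = {
    "basketball players": 0.9,
    "american people": 0.7,
    "countries": 0.05,
    "middle east": 0.1,
}

# Hypothetical candidate entities with their category-derived types.
example_candidates = {
    "Michael Jordan": {"basketball players", "american people"},
    "Jordan (country)": {"countries", "middle east"},
}

print(link_mention(example_type_probs, example_candidates))
```

Because the candidates are compared only through their types, no entity-specific parameters are learned in this sketch, which is consistent with the abstract's goal of avoiding memorized properties of frequent entities.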

Code & Models

Repositories

yasumasaonoe/ET4EL
PyTorch · Official

Datasets

naist-nlp/unseen
dataset · 21 downloads

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification