Same but Different: Distant Supervision for Predicting and Understanding Entity Linking Difficulty

Renato Stoffalette João; Pavlos Fafalios; Stefan Dietze

arXiv:1812.10387·cs.CL·July 30, 2021·1 cites

TL;DR

This paper introduces a method to predict how difficult an entity mention in a text is to link. Such predictions can support semi-automated entity linking systems by flagging challenging mentions, and they help in understanding the factors that influence linking performance.

Contribution

The paper proposes a consensus-based approach to labeling the difficulty of entity mentions and trains a classifier to predict that difficulty. The analysis also reveals latent features that affect entity linking accuracy.
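As a rough illustration of what consensus-based labeling could look like, the sketch below assigns a difficulty label to a mention from the level of agreement among several EL systems. This is not the authors' procedure: the function name `consensus_label`, the three label names, and both thresholds are hypothetical choices made for this example, and it assumes every system returned an entity ID for the mention.

```python
from collections import Counter


def consensus_label(links, easy_threshold=1.0, hard_threshold=0.5):
    """Label one mention by agreement among EL systems.

    links: one entity ID per EL system for the same mention.
    The thresholds and label names are illustrative assumptions,
    not values taken from the paper.
    """
    if not links:
        raise ValueError("need the output of at least one EL system")

    # Share of systems that voted for the most common entity.
    top_count = Counter(links).most_common(1)[0][1]
    agreement = top_count / len(links)

    if agreement >= easy_threshold:
        return "easy"    # all systems agree
    if agreement <= hard_threshold:
        return "hard"    # no entity wins a majority
    return "medium"      # partial agreement


if __name__ == "__main__":
    # Hypothetical outputs of three EL systems for three mentions.
    print(consensus_label(["Paris", "Paris", "Paris"]))
    print(consensus_label(["Paris", "Paris", "Paris,_Texas"]))
    print(consensus_label(["Paris", "Paris_Hilton", "Paris,_Texas"]))
```

Labels produced this way could then serve as distant-supervision targets for a difficulty classifier, with no manual annotation of difficulty required.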

Findings

01. The classifier predicts EL difficulty with high accuracy.

02. Latent corpus-specific features influence EL performance.

03. The method can support semi-automated EL pipelines by flagging difficult mentions.

Abstract

Entity Linking (EL) is the task of automatically identifying entity mentions in a piece of text and resolving them to a corresponding entity in a reference knowledge base like Wikipedia. There is a large number of EL tools available for different types of documents and domains, yet EL remains a challenging task where the lack of precision on particularly ambiguous mentions often spoils the usefulness of automated disambiguation results in real applications. A priori approximations of the difficulty to link a particular entity mention can facilitate flagging of critical cases as part of semi-automated EL systems, while detecting latent factors that affect the EL performance, like corpus-specific features, can provide insights on how to improve a system based on the special characteristics of the underlying corpus. In this paper, we first introduce a consensus-based method to generate…

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Text Readability and Simplification