Entity Tagging: Extracting Entities in Text Without Mention Supervision
Christina Du, Kashyap Popat, Louis Martin, Fabio Petroni

TL;DR
This paper shows that for entity extraction tasks focusing solely on the set of entities, mention boundary detection can be skipped, as a mention-agnostic model performs comparably to traditional mention-aware methods.
Contribution
The paper introduces the Entity Tagging formulation and demonstrates that mention detection is unnecessary for certain entity extraction tasks, enabling simpler models like GET to perform well.
Findings
GET achieves comparable performance to mention-aware models.
Mention detection does not significantly improve entity extraction in set-focused tasks.
Models trained on partial annotations perform well across benchmarks.
Abstract
Detection and disambiguation of all entities in text is a crucial task for a wide range of applications. The typical formulation of the problem involves two stages: detect mention boundaries and link all mentions to a knowledge base. For a long time, mention detection has been considered a necessary step for extracting all entities in a piece of text, even if the information about mention spans is ignored by some downstream applications that merely focus on the set of extracted entities. In this paper we show that, in such cases, detection of mention boundaries does not bring any considerable performance gain in extracting entities, and therefore can be skipped. To conduct our analysis, we propose an "Entity Tagging" formulation of the problem, where models are evaluated purely on the set of extracted entities without considering mentions. We compare a state-of-the-art mention-aware…
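To make the set-based evaluation concrete, here is a minimal sketch of how a prediction could be scored purely on the set of extracted entities, with no reference to mention spans. It assumes precision, recall, and F1 over entity identifiers for a single document. That metric choice, the function name `entity_set_scores`, and the example identifiers are illustrative assumptions, not details taken from the paper.

```python
def entity_set_scores(predicted, gold):
    """Score one document on entity sets only (hypothetical helper).

    Mention spans, mention order, and repeated mentions are all ignored:
    both inputs are reduced to sets of entity identifiers before comparison.
    """
    predicted, gold = set(predicted), set(gold)
    overlap = len(predicted & gold)  # entities that were correctly extracted

    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative identifiers only; any knowledge-base ID scheme would do.
gold_entities = {"Q90", "Q142"}
# The model names "Q90" twice (two mentions); the duplicate collapses.
predicted_entities = ["Q90", "Q90", "Q64"]

precision, recall, f1 = entity_set_scores(predicted_entities, gold_entities)
print(precision, recall, f1)
```

Because both inputs are reduced to sets, a model that emits an entity once scores the same as one that locates every mention of it, which is why mention boundaries play no role under this kind of evaluation.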
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
