USFD: Twitter NER with Drift Compensation and Linked Data
Leon Derczynski, Isabelle Augenstein, Kalina Bontcheva

TL;DR
This paper presents a Twitter NER system that uses Linked Data gazetteers, unsupervised clustering features, and drift compensation to improve entity recognition in social media text. It reports competitive results and an analysis of the system's components.
Contribution
This work integrates Linked Data gazetteers, drift compensation, and unsupervised clustering features in a Twitter NER system built for the W-NUT 2015 shared task.
Findings
Competitive performance on the W-NUT 2015 Twitter NER dataset
Drift compensation aimed at stylistic and topic drift in social media text
Analysis of the system's components and of the target dataset
Abstract
This paper describes a pilot NER system for Twitter, comprising the USFD system entry to the W-NUT 2015 NER shared task. The goal is to correctly label entities in a tweet dataset, using an inventory of ten types. We employ structured learning, drawing on gazetteers taken from Linked Data and on unsupervised clustering features, and attempt to compensate for stylistic and topic drift, a key challenge in social media text. Our result is competitive; we provide an analysis of the components of our methodology, and an examination of the target dataset in the context of this task.
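As a rough illustration of how gazetteer and clustering features can feed a structured learner, the sketch below builds per-token feature dictionaries of the kind a sequence labeller (for example a CRF) might consume. Everything in it is an assumption made for illustration: the gazetteer entries, the bit-string cluster paths (in the style of hierarchical word clusters), the prefix lengths, and the feature names are hypothetical and are not taken from the USFD system.

```python
from typing import Dict, List, Set

# Hypothetical toy gazetteers keyed by entity type. A real system would
# populate these from a Linked Data source; these entries are made up.
GAZETTEERS: Dict[str, Set[str]] = {
    "person": {"obama", "beyonce"},
    "geo-loc": {"sheffield", "london"},
}

# Hypothetical word -> cluster bit-string mapping. Words sharing a prefix
# sit in the same branch of the cluster hierarchy, so spelling variants
# such as "tmrw" and "tomorrow" can share features.
CLUSTERS: Dict[str, str] = {
    "london": "110100",
    "sheffield": "110101",
    "tmrw": "0111",
    "tomorrow": "0111",
}

# Assumed prefix lengths; shorter prefixes give coarser cluster features.
PREFIX_LENGTHS = (2, 4, 6)


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dictionary for the token at position i."""
    word = tokens[i]
    lower = word.lower()
    feats: Dict[str, object] = {
        "lower": lower,
        "is_title": word.istitle(),
        "is_hashtag": word.startswith("#"),
        "is_mention": word.startswith("@"),
    }
    # Gazetteer membership: one boolean feature per matching list.
    for name, entries in GAZETTEERS.items():
        if lower in entries:
            feats["gaz=" + name] = True
    # Cluster features: prefixes of the cluster path at several depths.
    path = CLUSTERS.get(lower)
    if path is not None:
        for n in PREFIX_LENGTHS:
            feats[f"cluster_prefix_{n}"] = path[:n]
    return feats


def tweet_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Feature dictionaries for every token in a tokenised tweet."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    for tok, feats in zip(["See", "you", "in", "Sheffield", "tmrw"],
                          tweet_features(["See", "you", "in", "Sheffield", "tmrw"])):
        print(tok, feats)
```

Cluster-prefix features are one common way to soften vocabulary mismatch between training and test tweets, since an unseen spelling can still land in a known cluster; whether and how the USFD system uses them for drift compensation is not specified in this summary.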
