CoNLL#: Fine-grained Error Analysis and a Corrected Test Set for CoNLL-03 English

Andrew Rueda; Elena Álvarez Mellado; Constantine Lignos

arXiv:2405.11865·cs.CL·May 21, 2024

TL;DR

This paper analyzes where the highest-performing NER models still fail on CoNLL-03 English, categorizes their errors in detail, and presents a corrected test set that makes error analysis more interpretable and helps guide future work.

Contribution

The paper provides a fine-grained error analysis of the highest-performing NER models and introduces CoNLL#, a corrected version of the test set that addresses its systematic and most prevalent errors, allowing more accurate evaluation.

Findings

01. Identified systematic errors in the original test set.

02. Achieved more precise error attribution with new annotations.

03. Provided a corrected test set for improved benchmarking.

Abstract

Modern named entity recognition systems have steadily improved performance in the age of larger and more powerful neural models. However, over the past several years, the state of the art has seemingly hit another plateau on the benchmark CoNLL-03 English dataset. In this paper, we perform a deep dive into the test outputs of the highest-performing NER models, conducting a fine-grained evaluation of their performance by introducing new document-level annotations on the test set. We go beyond F1 scores by categorizing errors in order to interpret the true state of the art for NER and guide future work. We review previous attempts at correcting the various flaws of the test set and introduce CoNLL#, a new corrected version of the test set that addresses its systematic and most prevalent errors, allowing for low-noise, interpretable error analysis.
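To make the idea of going beyond F1 scores concrete, the following is a minimal sketch of span-level scoring combined with a coarse error breakdown for BIO-tagged output. It is not the paper's method or taxonomy: the four categories (type, boundary, spurious, missed), the function names, and the toy tag sequences are all assumptions chosen for illustration.

```python
from collections import Counter


def bio_to_spans(tags):
    """Convert a BIO tag sequence into a set of (start, end, type) spans (end exclusive)."""
    spans = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # trailing "O" closes any open span
        # Close the open span on "O", on a new "B-", or on an "I-" of a different type.
        if start is not None and (tag == "O" or tag.startswith("B-") or tag[2:] != etype):
            spans.add((start, i, etype))
            start, etype = None, None
        # Open a new span (an "I-" with no open span is leniently treated as a start).
        if tag != "O" and start is None:
            start, etype = i, tag[2:]
    return spans


def span_f1(gold, pred):
    """Exact-match span precision, recall, and F1."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


def categorize_errors(gold, pred):
    """Assign each incorrect prediction to an illustrative (hypothetical) error category."""
    counts = Counter()
    gold_only, pred_only = gold - pred, pred - gold
    explained_gold = set()  # gold spans accounted for by a type or boundary error
    for ps, pe, _ in pred_only:
        same_bounds = [g for g in gold_only if (g[0], g[1]) == (ps, pe)]
        overlapping = [g for g in gold_only if g[0] < pe and ps < g[1]]
        if same_bounds:
            counts["type"] += 1  # right extent, wrong label
            explained_gold.update(same_bounds)
        elif overlapping:
            counts["boundary"] += 1  # overlaps a gold span, wrong extent
            explained_gold.update(overlapping)
        else:
            counts["spurious"] += 1  # no gold span overlaps this prediction
    counts["missed"] = len(gold_only - explained_gold)  # gold spans with no overlapping prediction
    return counts


# Toy example (invented tags, not drawn from CoNLL-03).
gold_tags = ["B-PER", "O", "B-ORG", "O", "B-LOC", "I-LOC", "O", "B-PER", "O"]
pred_tags = ["B-PER", "O", "B-LOC", "O", "O", "B-LOC", "O", "O", "B-MISC"]

gold_spans, pred_spans = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
scores = span_f1(gold_spans, pred_spans)
errors = categorize_errors(gold_spans, pred_spans)
print(scores, dict(errors))
```

In this toy case a single F1 value of 0.25 hides four different failure modes, one per category, which is the kind of distinction a fine-grained analysis is meant to surface.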

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification

Methods: Sparse Evolutionary Training