CrossWeigh: Training Named Entity Tagger from Imperfect Annotations
Zihan Wang, Jingbo Shang, Liyuan Liu, Lihao Lu, Jiacheng Liu, Jiawei Han

TL;DR
This paper identifies label mistakes in NER datasets, corrects them for a cleaner test set, and introduces CrossWeigh, a framework that improves NER training by handling label noise through data partitioning and weighted training.
Contribution
It provides a corrected NER test set for more accurate evaluation and proposes CrossWeigh, a novel method for training NER models robust to label mistakes.
Findings
Label mistakes were identified and manually corrected in about 5.38% of CoNLL03 test sentences.
CrossWeigh improves NER model performance across multiple datasets.
Re-evaluation with corrected labels yields more accurate model assessments.
Abstract
Everyone makes mistakes. So do human annotators when curating labels for named entity recognition (NER). Such label mistakes might hurt model training and interfere with model comparison. In this study, we dive deep into one of the widely adopted NER benchmark datasets, CoNLL03 NER. We are able to identify label mistakes in about 5.38% of the test sentences, which is a significant ratio considering that the state-of-the-art test F1 score is already around 93%. Therefore, we manually correct these label mistakes and form a cleaner test set. Our re-evaluation of popular models on this corrected test set leads to more accurate assessments, compared to those on the original test set. More importantly, we propose a simple yet effective framework, CrossWeigh, to handle label mistakes during NER model training. Specifically, it partitions the training data into several folds and trains independent NER…
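The abstract is cut off above, so the following is only a minimal Python sketch of the general idea it names: partition the training data into folds, train independent models, and turn their disagreement with the given labels into per-sentence training weights. Everything specific here is an assumption made for illustration and is not taken from this page: the function names, the toy majority-vote tagger standing in for a real NER model, the parameters `k`, `rounds`, and `epsilon`, and the weight rule `epsilon ** disagreements`. The published method may include steps that this sketch does not attempt to reproduce.

```python
import random
from collections import Counter, defaultdict


def train_majority_tagger(sentences):
    """Hypothetical stand-in for a real NER model.

    It memorises the most frequent tag seen for each token and
    falls back to "O" for unseen tokens.
    """
    counts = defaultdict(Counter)
    for tokens, tags in sentences:
        for tok, tag in zip(tokens, tags):
            counts[tok][tag] += 1
    table = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
    return lambda tokens: [table.get(tok, "O") for tok in tokens]


def estimate_weights(sentences, k=3, rounds=2, epsilon=0.7, seed=0):
    """Assumed weighting scheme: down-weight sentences whose labels
    disagree with models trained on the other folds."""
    rng = random.Random(seed)
    disagreements = [0] * len(sentences)
    for _ in range(rounds):
        idx = list(range(len(sentences)))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
        for held_out in folds:
            held = set(held_out)
            train = [sentences[i] for i in idx if i not in held]
            tagger = train_majority_tagger(train)
            # Count a potential mistake when the held-out prediction
            # differs from the annotated tag sequence.
            for i in held_out:
                tokens, tags = sentences[i]
                if tagger(tokens) != list(tags):
                    disagreements[i] += 1
    # Assumption: each disagreement multiplies the weight by epsilon.
    return [epsilon ** c for c in disagreements]


# Toy data: eleven consistently labelled sentences and one suspicious one.
clean = (("Paris", "is", "nice"), ("B-LOC", "O", "O"))
noisy = (("Paris", "is", "nice"), ("O", "O", "O"))
data = [clean] * 11 + [noisy]
weights = estimate_weights(data)
```

In this toy run the sentence whose labels conflict with the rest receives a weight below 1, while the consistent sentences keep weight 1. In weighted training, such weights would typically scale each sentence's contribution to the loss.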
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
