Automated Testing and Improvement of Named Entity Recognition Systems
Boxi Yu, Yiyan Hu, Qiuyang Mang, Wenhan Hu, Pinjia He

TL;DR
This paper introduces TIN, an automated testing and repairing framework for NER systems that improves their accuracy by identifying and fixing errors through consistency checks across similar contexts and entities.
Contribution
The paper presents a novel, widely applicable method for automatically testing and repairing NER systems, significantly reducing errors and improving reliability.
Findings
High precision in error detection (85.0%-93.4%)
Error reduction rate of 26.8%-50.6%
Successful repair of 1,056 errors out of 1,877 reported
Abstract
Named entity recognition (NER) systems have seen rapid progress in recent years due to the development of deep neural networks. These systems are widely used in various natural language processing applications, such as information extraction, question answering, and sentiment analysis. However, the complexity and intractability of deep neural networks can make NER systems unreliable in certain circumstances, resulting in incorrect predictions. For example, NER systems may misidentify female names as chemicals or fail to recognize the names of minority groups, leading to user dissatisfaction. To tackle this problem, we introduce TIN, a novel, widely applicable approach for automatically testing and repairing various NER systems. The key idea for automated testing is that the NER predictions of the same named entities under similar contexts should be identical. The core idea for automated…
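The testing idea stated in the abstract, that the same named entity should receive the same prediction in similar contexts, can be illustrated with a minimal sketch. This is not TIN's implementation. The `Predictor` interface, the `find_inconsistencies` helper, and the `toy_predict` stand-in model are all hypothetical names introduced here, and the similar sentences are written by hand rather than generated automatically.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface (an assumption, not from the paper):
# a predictor maps a sentence to {entity text: predicted label}.
Predictor = Callable[[str], Dict[str, str]]


def find_inconsistencies(
    original: str,
    variants: List[str],
    predict: Predictor,
) -> List[Tuple[str, str, str, str]]:
    """Return (variant, entity, label_in_original, label_in_variant) for each
    entity whose label differs between the original and a similar sentence."""
    base = predict(original)
    suspicious = []
    for variant in variants:
        pred = predict(variant)
        for entity, label in base.items():
            if entity not in variant:
                continue  # the rewrite dropped this entity, nothing to compare
            other = pred.get(entity, "O")  # "O" means not tagged as an entity
            if other != label:
                suspicious.append((variant, entity, label, other))
    return suspicious


def toy_predict(sentence: str) -> Dict[str, str]:
    """Stand-in NER model with a deliberate, context-dependent bug."""
    labels: Dict[str, str] = {}
    if "Paris" in sentence:
        # Bug for illustration: the word "met" flips the label to a person.
        labels["Paris"] = "PER" if "met" in sentence else "LOC"
    if "Alice" in sentence:
        labels["Alice"] = "PER"
    return labels


issues = find_inconsistencies(
    "Alice flew to Paris yesterday.",
    [
        "Alice travelled to Paris yesterday.",
        "Alice met friends in Paris yesterday.",
    ],
    toy_predict,
)
for variant, entity, before, after in issues:
    print(f"{entity!r}: {before} -> {after} in {variant!r}")
```

With the toy model, only the second similar sentence is flagged, because "Paris" changes from a location to a person there. A real setup would replace `toy_predict` with a call to the NER system under test.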
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
