Detecting Label Errors by using Pre-Trained Language Models
Derek Chong, Jenny Hong, Christopher D. Manning

TL;DR
Pre-trained language models can effectively detect label errors in natural language datasets: ranking data points by fine-tuned task loss outperforms previous methods. The paper also proposes human-originated noise as a more realistic evaluation standard than synthetic noise.
Contribution
The paper introduces a simple yet effective method using pre-trained models for label error detection and proposes a realistic human-originated noise benchmark for evaluation.
Findings
Pre-trained models outperform previous error detection methods.
Human-originated noise is harder to detect and more realistic than existing synthetic noise.
Pre-trained models achieve 9-36% higher AUPRC than previous methods when detecting real label errors.
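AUPRC (area under the precision-recall curve) measures how well a ranking places true label errors ahead of clean examples. The sketch below is an illustration, not the paper's evaluation code: it uses average precision, a common estimator of AUPRC, and the function name `average_precision`, the toy inputs, and the tie-breaking by input order are all assumptions made for this example.

```python
import numpy as np

def average_precision(is_error, scores):
    """Average precision of a ranking (a common AUPRC estimator).

    is_error: 1/True where the example is a verified label error.
    scores:   higher means "more suspicious" (for example, task loss).
    Ties in `scores` are broken by input order (an assumption of this sketch).
    """
    is_error = np.asarray(is_error, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    if is_error.sum() == 0:
        raise ValueError("need at least one true error to compute precision")
    # Rank examples from most to least suspicious.
    order = np.argsort(-scores, kind="stable")
    hits = is_error[order]
    # Precision after inspecting the top k examples, for every k.
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    # Average the precision values at the ranks where a true error is found.
    return float(precision_at_k[hits].mean())

# Toy data: true errors sit at ranks 1 and 3 of the suspicion ranking.
toy_is_error = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1]
ap = average_precision(toy_is_error, toy_scores)
```

A ranking that places every true error first scores 1.0, so a detector with higher AUPRC surfaces more real errors per example inspected.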
Abstract
We show that large pre-trained language models are inherently highly capable of identifying label errors in natural language datasets: simply examining out-of-sample data points in descending order of fine-tuned task loss significantly outperforms more complex error-detection mechanisms proposed in previous work. To this end, we contribute a novel method for introducing realistic, human-originated label noise into existing crowdsourced datasets such as SNLI and TweetNLP. We show that this noise has similar properties to real, hand-verified label errors, and is harder to detect than existing synthetic noise, creating challenges for model robustness. We argue that human-originated noise is a better standard for evaluation than synthetic noise. Finally, we use crowdsourced verification to evaluate the detection of real errors on IMDB, Amazon Reviews, and Recon, and confirm that…
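The ranking step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the out-of-sample class probabilities from a fine-tuned model are already available (for example, from cross-validation folds), and the function name `rank_by_task_loss` and the toy probabilities are hypothetical.

```python
import numpy as np

def rank_by_task_loss(probs, given_labels):
    """Order examples from highest to lowest cross-entropy loss.

    probs:        (n_examples, n_classes) out-of-sample predicted probabilities.
    given_labels: (n_examples,) integer labels as they appear in the dataset.
    Returns (order, losses); order[0] is the most suspicious example.
    """
    probs = np.asarray(probs, dtype=float)
    given_labels = np.asarray(given_labels, dtype=int)
    # Probability the model assigned to each example's given label.
    p_given = probs[np.arange(len(given_labels)), given_labels]
    # Per-example cross-entropy loss; clip to avoid log(0).
    losses = -np.log(np.clip(p_given, 1e-12, 1.0))
    # Descending order of loss, ties kept in input order.
    order = np.argsort(-losses, kind="stable")
    return order, losses

# Toy probabilities for four examples over three classes (made up for illustration).
toy_probs = np.array([
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.02, 0.03, 0.95],
    [0.60, 0.30, 0.10],
])
toy_labels = np.array([0, 1, 0, 1])
order, losses = rank_by_task_loss(toy_probs, toy_labels)
```

In this toy case the third example (given label 0, but the model puts 0.95 on class 2) has the highest loss and would be reviewed first. Using out-of-sample predictions matters because a model that has trained on a mislabeled example may simply memorize its label and assign it a low loss.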
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
