Improving Label Error Detection and Elimination with Uncertainty Quantification
Johannes Jakubik, Michael Vössing, Manil Maskey, Christopher Wölfle, Gerhard Satzger

TL;DR
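This paper introduces novel, model-agnostic algorithms that detect label errors in supervised learning using uncertainty quantification (Monte Carlo Dropout, entropy-based uncertainty measures, and ensembles) rather than raw softmax probabilities. Removing the detected errors improves both classification accuracy and dataset quality.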
Contribution
The paper develops and evaluates new UQ-LED algorithms that outperform existing methods in label error detection and demonstrates their effectiveness in improving model accuracy.
Findings
UQ-LED algorithms outperform state-of-the-art confident learning methods.
Removing detected label errors improves classification accuracy.
Synthetic label errors can be realistically generated for testing.
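As an illustration of the last finding, the sketch below injects class-dependent label noise into a clean label list. This summary does not describe the paper's generation procedure, so the function name inject_label_noise, the similarity matrix, and the noise_rate parameter are all assumptions made for this example. The only idea it shows is that mislabels are more realistic when the wrong class is drawn in proportion to how easily it is confused with the true class.

```python
import random

def inject_label_noise(labels, similarity, noise_rate, seed=0):
    """Flip a fraction of labels to a *different* class.

    Hypothetical helper, not the paper's procedure.
    similarity[a][b] is an assumed, non-negative score for how easily
    class a is confused with class b; the diagonal is ignored.
    Returns the noisy labels and the sorted indices that were flipped.
    """
    rng = random.Random(seed)
    n_classes = len(similarity)
    n_flip = round(noise_rate * len(labels))
    flip_idx = sorted(rng.sample(range(len(labels)), n_flip))

    noisy = list(labels)
    for i in flip_idx:
        true = labels[i]
        # Weight every other class by its similarity to the true class;
        # the true class gets weight 0 so the label always changes.
        weights = [0.0 if c == true else similarity[true][c]
                   for c in range(n_classes)]
        noisy[i] = rng.choices(range(n_classes), weights=weights, k=1)[0]
    return noisy, flip_idx
```

Because the flipped indices are returned, a detector's precision and recall can be measured against known ground truth.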
Abstract
Identifying and handling label errors can significantly enhance the accuracy of supervised machine learning models. Recent approaches for identifying label errors demonstrate that a low self-confidence of models with respect to a certain label represents a good indicator of an erroneous label. However, the latest work has built on softmax probabilities to measure self-confidence. In this paper, we argue that -- as softmax probabilities do not reflect a model's predictive uncertainty accurately -- label error detection requires more sophisticated measures of model uncertainty. Therefore, we develop a range of novel, model-agnostic algorithms for Uncertainty Quantification-Based Label Error Detection (UQ-LED), which combine the techniques of confident learning (CL), Monte Carlo Dropout (MCD), model uncertainty measures (e.g., entropy), and ensemble learning to enhance label error detection.…
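To make the combination of ideas in the abstract more concrete, here is a minimal sketch of one plausible way to combine Monte Carlo Dropout averaging, confident-learning-style per-class thresholds, and an entropy filter. It is not the authors' UQ-LED implementation. The function names, the max_entropy cutoff, and the rule of flagging only low-entropy disagreements are assumptions made for illustration. The input mc_probs[t][i][c] is assumed to hold the class probabilities from T forward passes with dropout left active.

```python
import math

def mean_probs(mc_probs):
    """Average T stochastic forward passes: mc_probs[t][i][c] -> mean[i][c]."""
    T = len(mc_probs)
    n, C = len(mc_probs[0]), len(mc_probs[0][0])
    return [[sum(mc_probs[t][i][c] for t in range(T)) / T for c in range(C)]
            for i in range(n)]

def entropy(p):
    """Shannon entropy (nats) of one probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)

def flag_label_errors(mc_probs, labels, max_entropy):
    """Return indices of examples whose given label looks wrong.

    Illustrative rule (an assumption, not the paper's exact algorithm):
    flag example i if the MC-averaged prediction confidently points to a
    class other than labels[i] AND the predictive entropy is low enough
    that the model is reasonably certain about that disagreement.
    """
    probs = mean_probs(mc_probs)
    C = len(probs[0])

    # Confident-learning-style threshold per class: the average
    # self-confidence over all examples that carry that label.
    thresholds = []
    for c in range(C):
        conf = [probs[i][c] for i, y in enumerate(labels) if y == c]
        thresholds.append(sum(conf) / len(conf) if conf else 1.0)

    flagged = []
    for i, y in enumerate(labels):
        p = probs[i]
        confident = [c for c in range(C) if p[c] >= thresholds[c]]
        if not confident:
            continue  # no class clears its threshold; leave the label alone
        guess = max(confident, key=lambda c: p[c])
        if guess != y and entropy(p) <= max_entropy:
            flagged.append(i)
    return flagged
```

Lowering max_entropy makes the detector more conservative. Only disagreements that persist across the stochastic passes are flagged, which is the intuition behind replacing a single softmax output with an uncertainty-aware estimate.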
Taxonomy
Topics: Fault Detection and Control Systems
Methods: Softmax · Dropout · Monte Carlo Dropout
