Adaptive Label Error Detection: A Bayesian Approach to Mislabeled Data Detection
Zan Chaudhry, Noam H. Rotenberg, Brian Caffo, Craig K. Jones, Haris I. Sair

TL;DR
This paper introduces ALED, a Bayesian method that detects mislabeled training data for deep learning classifiers by modeling each class's features with a Gaussian distribution, and reports markedly higher detection sensitivity on medical imaging datasets without loss of precision.
Contribution
ALED is a novel Bayesian approach that models class features with Gaussian distributions and uses likelihood ratios for improved label error detection.
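One way to write the likelihood ratio idea is sketched below. This is an illustration, not necessarily the paper's exact statistic. Here z is the reduced feature vector of a sample with assigned label y, and each class k is modeled by a Gaussian with mean \mu_k and covariance \Sigma_k. The threshold \tau and the choice to compare against the best competing class are assumptions made for this sketch.

```latex
p(z \mid k) = \mathcal{N}(z;\, \mu_k, \Sigma_k), \qquad
\Lambda(z) = \frac{\max_{k \neq y} p(z \mid k)}{p(z \mid y)}, \qquad
\text{flag the sample as possibly mislabeled if } \Lambda(z) > \tau .
```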
Findings
ALED shows markedly higher sensitivity than established label error detection methods, without compromising precision, on multiple medical imaging datasets.
Correcting labels with ALED reduces test errors by 33.8%.
ALED is available in the statlab Python package.
Abstract
Machine learning classification systems are susceptible to poor performance when trained with incorrect ground truth labels, even when data is well-curated by expert annotators. As machine learning becomes more widespread, it is increasingly imperative to identify and correct mislabeling to develop more powerful models. In this work, we motivate and describe Adaptive Label Error Detection (ALED), a novel method of detecting mislabeling. ALED extracts an intermediate feature space from a deep convolutional neural network, denoises the features, models the reduced manifold of each class with a multidimensional Gaussian distribution, and performs a simple likelihood ratio test to identify mislabeled samples. We show that ALED has markedly increased sensitivity, without compromising precision, compared to established label error detection methods, on multiple medical imaging datasets. We…
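The pipeline described in the abstract (feature extraction, denoising, per-class Gaussian modeling, likelihood ratio test) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not the statlab API: the features are assumed to be already extracted from a network, PCA stands in for the denoising and reduction step, and the function names, the `ridge` regularizer, and the `log_threshold` parameter are all hypothetical.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project features onto their top principal components (stand-in for denoising)."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def gaussian_logpdf(z, mean, cov):
    """Row-wise log density of a multivariate Gaussian."""
    d = z.shape[1]
    diff = z - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def flag_suspect_labels(features, labels, n_components=2,
                        log_threshold=0.0, ridge=1e-6):
    """Flag samples whose features fit another class better than the given label.

    Returns a boolean mask of flagged samples and the best competing class
    for every sample. All parameter names are hypothetical.
    """
    z = pca_reduce(features, n_components)
    classes = np.unique(labels)
    loglik = np.empty((len(z), len(classes)))
    for j, c in enumerate(classes):
        zc = z[labels == c]
        # Fit one Gaussian per class from the (possibly noisy) given labels;
        # the small ridge keeps the covariance invertible.
        cov = np.cov(zc, rowvar=False) + ridge * np.eye(n_components)
        loglik[:, j] = gaussian_logpdf(z, zc.mean(axis=0), cov)

    rows = np.arange(len(z))
    given = np.searchsorted(classes, labels)   # column index of each given label
    given_ll = loglik[rows, given]
    other = loglik.copy()
    other[rows, given] = -np.inf               # exclude the given class
    log_ratio = other.max(axis=1) - given_ll   # log of the likelihood ratio
    return log_ratio > log_threshold, classes[other.argmax(axis=1)]
```

Working with log-likelihoods rather than raw densities avoids numerical underflow in higher dimensions. A threshold of zero flags any sample that a competing class explains better than its assigned class; raising it trades sensitivity for precision.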
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications
