Learning with Confident Examples: Rank Pruning for Robust Classification with Noisy Labels
Curtis G. Northcutt, Tailin Wu, Isaac L. Chuang

TL;DR
This paper introduces Rank Pruning, a fast and general method for robust binary classification with noisy labels, capable of accurately estimating noise rates and achieving state-of-the-art results on standard datasets.
Contribution
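Rank Pruning is a novel, efficient approach that consistently estimates noise rates and, in ideal conditions, attains the same expected risk as learning with uncorrupted labels. It works with any probabilistic classifier and outperforms prior methods on MNIST and CIFAR.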
Findings
RP achieves state-of-the-art noise estimation accuracy.
RP with a CNN classifier predicts MNIST digits with less than 0.5% error under high label noise.
RP performs well even with noise from a third distribution.
Abstract
Noisy PN learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate rho1 for positive examples and rho0 for negative examples. We propose Rank Pruning (RP) to solve noisy PN learning and the open problem of estimating the noise rates, i.e. the fraction of wrong positive and negative labels. Unlike prior solutions, RP is time-efficient and general, requiring O(T) for any unrestricted choice of probabilistic classifier with T fitting time. We prove RP has consistent noise estimation and equivalent expected risk as learning with uncorrupted labels in ideal conditions, and derive closed-form solutions when conditions are non-ideal. RP achieves state-of-the-art noise estimation and F1, error, and AUC-PR for both MNIST and CIFAR datasets, regardless of the amount of noise and performs similarly impressively when a large…
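The noisy PN setting defined in the abstract can be illustrated with a small simulation. The sketch below is not Rank Pruning itself and is not taken from the paper: it only generates flipped labels from known true labels, assuming rho1 is the probability that a positive label is flipped to negative and rho0 the probability that a negative label is flipped to positive. The function names `flip_labels` and `empirical_noise_rates` are hypothetical, chosen for illustration.

```python
import random


def flip_labels(labels, rho1, rho0, seed=0):
    """Return noisy copies of binary labels.

    Assumption: each positive (1) is flipped to 0 with probability rho1,
    and each negative (0) is flipped to 1 with probability rho0.
    """
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        flip_prob = rho1 if y == 1 else rho0
        noisy.append(1 - y if rng.random() < flip_prob else y)
    return noisy


def empirical_noise_rates(labels, noisy):
    """Fraction of true positives and true negatives whose observed label is wrong.

    This uses the true labels, which a real noisy PN learner never sees;
    it only checks that the simulation matches the intended rates.
    """
    pos = [s for y, s in zip(labels, noisy) if y == 1]
    neg = [s for y, s in zip(labels, noisy) if y == 0]
    rho1_hat = sum(1 for s in pos if s == 0) / len(pos)
    rho0_hat = sum(1 for s in neg if s == 1) / len(neg)
    return rho1_hat, rho0_hat


true_labels = [1] * 5000 + [0] * 5000
noisy_labels = flip_labels(true_labels, rho1=0.3, rho0=0.1)
rho1_hat, rho0_hat = empirical_noise_rates(true_labels, noisy_labels)
print(f"empirical rho1 = {rho1_hat:.3f}, empirical rho0 = {rho0_hat:.3f}")
```

With enough examples the empirical rates should land close to the chosen rho1 and rho0. The problem the paper addresses is harder: recovering those rates, and training a classifier, from the noisy labels alone.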
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Anomaly Detection Techniques and Applications
Methods: Pruning
