PropMix: Hard Sample Filtering and Proportional MixUp for Learning with Noisy Labels
Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro

TL;DR
PropMix is a learning algorithm for training with noisy labels: it filters out hard noisy samples, re-labels the easy noisy ones, and applies MixUp augmentation to the combined set of clean and re-labelled samples. It reports state-of-the-art results across multiple datasets.
Contribution
The paper introduces PropMix, a method that filters hard noisy samples and leverages MixUp to improve learning with noisy labels, especially at high noise rates.
Findings
PropMix achieves state-of-the-art results on CIFAR-10/-100, Red Mini-ImageNet, Clothing1M, and WebVision.
PropMix outperforms existing methods on severe label-noise benchmarks.
Self-supervised pre-training improves robustness at high label-noise rates.
Abstract
The most competitive noisy label learning methods rely on an unsupervised classification of clean and noisy samples, where samples classified as noisy are re-labelled and "MixMatched" with the clean samples. These methods have two issues in large noise rate problems: 1) the noisy set is more likely to contain hard samples that are incorrectly re-labelled, and 2) the number of samples produced by MixMatch tends to be reduced because it is constrained by the small clean set size. In this paper, we introduce the learning algorithm PropMix to handle the issues above. PropMix filters out hard noisy samples, with the goal of increasing the likelihood of correctly re-labelling the easy noisy samples. Also, PropMix places clean and re-labelled easy noisy samples in a training set that is augmented with MixUp, removing the clean set size constraint and including a large proportion of correctly…
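The two steps the abstract describes (dropping hard noisy samples, then applying MixUp to the clean and re-labelled easy samples) can be illustrated with a minimal sketch. The abstract does not say how hard samples are identified, so the confidence threshold `tau` below is purely an assumption for illustration, as are the function names, the `alpha` value, and the list-based data layout; this is not the authors' implementation.

```python
import random


def split_easy_hard(noisy_probs, tau=0.9):
    """Split noisy samples by prediction confidence (hypothetical rule).

    noisy_probs: one list of class probabilities per noisy sample.
    Returns (easy, hard): easy holds (index, pseudo_label) pairs that are
    kept and re-labelled; hard holds the indices that are filtered out.
    """
    easy, hard = [], []
    for i, probs in enumerate(noisy_probs):
        conf = max(probs)
        if conf >= tau:
            easy.append((i, probs.index(conf)))  # re-label with the argmax class
        else:
            hard.append(i)
    return easy, hard


def one_hot(label, num_classes):
    """Encode an integer label as a one-hot list of floats."""
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Standard MixUp: convex combination of two inputs and their labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def mixup_batch(xs, ys, alpha=4.0, rng=random):
    """Mix every sample with a randomly chosen partner from the same batch.

    A single lam ~ Beta(alpha, alpha) is drawn for the whole batch
    (alpha is an assumed value, not taken from the source).
    """
    perm = list(range(len(xs)))
    rng.shuffle(perm)
    lam = rng.betavariate(alpha, alpha)
    mixed = [mixup_pair(xs[i], ys[i], xs[j], ys[j], lam)
             for i, j in enumerate(perm)]
    return mixed, lam
```

In this sketch the batch passed to `mixup_batch` would contain the clean samples together with the easy noisy samples under their pseudo-labels, so its size is not tied to the clean set alone.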
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning
