PropMix: Hard Sample Filtering and Proportional MixUp for Learning with Noisy Labels

Filipe R. Cordeiro; Vasileios Belagiannis; Ian Reid; Gustavo Carneiro

arXiv:2110.11809 · cs.CV · October 25, 2021

TL;DR

PropMix is a novel learning algorithm that effectively filters hard noisy samples and uses MixUp augmentation to improve training with noisy labels, achieving state-of-the-art results across multiple datasets.

Contribution

The paper introduces PropMix, a method that filters out hard noisy samples and applies MixUp to the clean and re-labelled easy noisy samples, targeting high-noise scenarios in particular.

Findings

01

PropMix achieves state-of-the-art results on CIFAR-10/-100, Red Mini-ImageNet, Clothing1M, and WebVision.

02

PropMix outperforms existing methods in severe label noise benchmarks.

03

Self-supervised pre-training improves robustness when the label noise rate is high.

Abstract

The most competitive noisy label learning methods rely on an unsupervised classification of clean and noisy samples, where samples classified as noisy are re-labelled and "MixMatched" with the clean samples. These methods have two issues in large noise rate problems: 1) the noisy set is more likely to contain hard samples that are incorrectly re-labelled, and 2) the number of samples produced by MixMatch tends to be reduced because it is constrained by the small clean set size. In this paper, we introduce the learning algorithm PropMix to handle the issues above. PropMix filters out hard noisy samples, with the goal of increasing the likelihood of correctly re-labelling the easy noisy samples. Also, PropMix places clean and re-labelled easy noisy samples in a training set that is augmented with MixUp, removing the clean set size constraint and including a large proportion of correctly…
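
The two steps named in the abstract (dropping hard noisy samples, then applying MixUp to the pooled clean and easy noisy samples) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the confidence-threshold rule for separating easy from hard samples, and the default `alpha` are all assumptions made for illustration, and the abstract does not specify the paper's actual filtering criterion. Only the MixUp interpolation itself (a convex combination weighted by a Beta-distributed coefficient) follows the standard definition.

```python
import random


def build_training_set(clean, noisy, confidence, threshold=0.5):
    """Pool clean samples with 'easy' noisy samples and drop the hard ones.

    clean:      list of (x, y) pairs believed to be correctly labelled
    noisy:      list of (x, y_relabelled) pairs
    confidence: one score per noisy sample (hypothetical: how much the
                model trusts the new label)
    threshold:  hypothetical cut-off; the paper's real rule may differ
    """
    easy = [s for s, c in zip(noisy, confidence) if c >= threshold]
    return clean + easy


def mixup(x_a, y_a, x_b, y_b, alpha=4.0, rng=random):
    """Standard MixUp: blend two inputs and their soft labels.

    lam is drawn from Beta(alpha, alpha); alpha=4.0 is an assumed default.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam
```

Because every sample in the pooled set can be mixed with any other, the number of augmented samples no longer depends on the size of the clean set alone, which is the constraint the abstract attributes to MixMatch-based methods.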

Code & Models

Repositories

filipe-research/propmix
PyTorch · Official

Taxonomy

Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning