Contrast to Divide: Self-Supervised Pre-Training for Learning with Noisy Labels
Evgenii Zheltonozhskii, Chaim Baskin, Avi Mendelson, Alex M. Bronstein, Or Litany

TL;DR
This paper introduces C2D, a self-supervised pre-training framework that enhances learning with noisy labels by improving feature quality and reducing warm-up time, leading to significant performance gains especially under high noise conditions.
Contribution
The paper proposes a novel self-supervised pre-training method called Contrast to Divide (C2D) that addresses the warm-up obstacle in noisy label learning, improving robustness and accuracy.
Findings
C2D boosts performance in high noise regimes, e.g., over 27% improvement on CIFAR-100 with 90% noise.
C2D outperforms previous methods on WebVision and ImageNet, with 3% higher top-1 accuracy.
Self-supervised pre-training reduces warm-up duration and noise susceptibility in label-noise learning.
Abstract
The success of learning with noisy labels (LNL) methods relies heavily on the success of a warm-up stage where standard supervised training is performed using the full (noisy) training set. In this paper, we identify a "warm-up obstacle": the inability of standard warm-up stages to train high quality feature extractors and avert memorization of noisy labels. We propose "Contrast to Divide" (C2D), a simple framework that solves this problem by pre-training the feature extractor in a self-supervised fashion. Using self-supervised pre-training boosts the performance of existing LNL approaches by drastically reducing the warm-up stage's susceptibility to noise level, shortening its duration, and improving extracted feature quality. C2D works out of the box with existing methods and demonstrates markedly improved performance, especially in the high noise regime, where we get a boost of more…
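As a rough illustration of the kind of objective a self-supervised pre-training stage can use, the sketch below computes a SimCLR-style NT-Xent contrastive loss in plain Python. The choice of NT-Xent, the function name `nt_xent_loss`, and the `temperature` default are assumptions made for illustration; the abstract does not say which self-supervised method C2D uses. The point it shows is that the loss depends only on pairs of augmented views and never on the (possibly noisy) labels.

```python
import math


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """Illustrative NT-Xent loss (assumed objective, not taken from the paper).

    z_a[i] and z_b[i] are embeddings of two augmented views of sample i.
    No class labels are used anywhere in the computation.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z = [normalize(v) for v in list(z_a) + list(z_b)]
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same sample
        # Cosine similarity to every embedding, scaled by the temperature.
        sims = [
            sum(p * q for p, q in zip(z[i], z[j])) / temperature
            for j in range(2 * n)
        ]
        # Numerically stable log-sum-exp over all embeddings except i itself.
        others = [s for j, s in enumerate(sims) if j != i]
        m = max(others)
        log_denom = m + math.log(sum(math.exp(s - m) for s in others))
        total += log_denom - sims[pos]
    return total / (2 * n)


# Two samples with matching views versus the same views paired incorrectly.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
misaligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Here `aligned` is smaller than `misaligned`, because the loss rewards embeddings that place two views of the same image close together. In a pipeline like the one the abstract describes, a feature extractor trained this way would then be handed to an existing LNL method in place of a randomly initialized one.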
Videos
Contrast to Divide: Self-Supervised Pre-Training for Learning with Noisy Labels (YouTube)
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning
