Contrast to Divide: Self-Supervised Pre-Training for Learning with Noisy Labels

Evgenii Zheltonozhskii; Chaim Baskin; Avi Mendelson; Alex M. Bronstein; Or Litany

arXiv:2103.13646 · cs.CV · February 22, 2022

TL;DR

This paper introduces C2D, a self-supervised pre-training framework that enhances learning with noisy labels by improving feature quality and reducing warm-up time, leading to significant performance gains especially under high noise conditions.

Contribution

The paper proposes a novel self-supervised pre-training method called Contrast to Divide (C2D) that addresses the warm-up obstacle in noisy label learning, improving robustness and accuracy.

Findings

1. C2D boosts performance in high-noise regimes, e.g., a gain of more than 27% on CIFAR-100 with 90% label noise.

2. C2D outperforms previous methods on WebVision and ImageNet by 3% in top-1 accuracy.

3. Self-supervised pre-training shortens the warm-up stage and makes it less susceptible to the noise level in label-noise learning.

Abstract

The success of learning with noisy labels (LNL) methods relies heavily on the success of a warm-up stage where standard supervised training is performed using the full (noisy) training set. In this paper, we identify a "warm-up obstacle": the inability of standard warm-up stages to train high quality feature extractors and avert memorization of noisy labels. We propose "Contrast to Divide" (C2D), a simple framework that solves this problem by pre-training the feature extractor in a self-supervised fashion. Using self-supervised pre-training boosts the performance of existing LNL approaches by drastically reducing the warm-up stage's susceptibility to noise level, shortening its duration, and improving extracted feature quality. C2D works out of the box with existing methods and demonstrates markedly improved performance, especially in the high noise regime, where we get a boost of more…
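To make the idea of label-free pre-training concrete, here is a minimal sketch of a contrastive objective of the NT-Xent kind, written in plain Python. The abstract above does not say which self-supervised objective C2D uses, so the choice of loss, the function name `nt_xent_loss`, the `temperature` default, and the convention that consecutive rows are two views of the same image are all assumptions made for illustration. The point it shows is that the loss is computed from embeddings alone, so noisy labels cannot influence this stage.

```python
import math


def nt_xent_loss(z, temperature=0.5):
    """Illustrative NT-Xent contrastive loss (hypothetical helper, not the authors' code).

    z is a list of 2N embedding vectors; z[2k] and z[2k + 1] are assumed to be
    two augmented views of the same sample. No labels are used anywhere.
    """
    n = len(z)
    if n < 2 or n % 2 != 0:
        raise ValueError("expected an even number (>= 2) of embeddings")

    # L2-normalize each embedding so dot products are cosine similarities.
    normed = []
    for v in z:
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        normed.append([x / norm for x in v])

    def sim(a, b):
        return sum(x * y for x, y in zip(normed[a], normed[b])) / temperature

    total = 0.0
    for i in range(n):
        j = i ^ 1  # index of the other view of the same sample
        logits = [sim(i, k) for k in range(n) if k != i]
        # Numerically stable log-sum-exp over all other embeddings.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(s - m) for s in logits))
        total += log_denom - sim(i, j)
    return total / n


# Two views of each sample agree: low loss.
aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Views of the same sample disagree: high loss.
misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
print(nt_xent_loss(aligned) < nt_xent_loss(misaligned))
```

In a pipeline of the kind the abstract describes, an encoder trained with such an objective would then be handed to an existing LNL method in place of a randomly initialized one before the warm-up stage begins.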

Code & Models

Repositories

ContrastToDivide/C2D · PyTorch · Official

Videos

Contrast to Divide: Self-Supervised Pre-Training for Learning with Noisy Labels · YouTube

Taxonomy

Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning