DivideMix: Learning with Noisy Labels as Semi-supervised Learning

Junnan Li; Richard Socher; Steven C.H. Hoi

arXiv:2002.07394 · cs.CV · February 20, 2020 · 500 cites

TL;DR

DivideMix treats learning with noisy labels as a semi-supervised learning problem: it dynamically divides the training data into a labeled set of likely-clean samples and an unlabeled set of likely-noisy samples, trains on both sets, and reports improved accuracy on benchmark datasets.

Contribution

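The paper presents a data division method that fits a mixture model to the per-sample loss distribution, and a training scheme with two diverged networks in which each network uses the data division produced by the other, so that learning with noisy labels can draw on semi-supervised techniques while avoiding confirmation bias.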

Findings

01 Significant performance improvements over state-of-the-art methods.

02 Effective noise handling through dynamic data division.

03 Enhanced semi-supervised training with label co-refinement.

Abstract

Deep neural networks are known to be annotation-hungry. Numerous efforts have been devoted to reducing the annotation cost when learning with deep networks. Two prominent directions include learning with noisy labels and semi-supervised learning by exploiting unlabeled data. In this work, we propose DivideMix, a novel framework for learning with noisy labels by leveraging semi-supervised learning techniques. In particular, DivideMix models the per-sample loss distribution with a mixture model to dynamically divide the training data into a labeled set with clean samples and an unlabeled set with noisy samples, and trains the model on both the labeled and unlabeled data in a semi-supervised manner. To avoid confirmation bias, we simultaneously train two diverged networks where each network uses the dataset division from the other network. During the semi-supervised training phase, we…
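The division step described in the abstract can be illustrated with a minimal sketch. It assumes a two-component one-dimensional Gaussian mixture fitted by EM and a clean-probability threshold of 0.5; the abstract itself only says "a mixture model", so both choices are assumptions here. The function names (`clean_probabilities`, `divide_by_loss`, `co_divide`) are hypothetical and are not taken from the authors' code.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def clean_probabilities(losses, iters=100):
    """Fit a two-component 1-D Gaussian mixture with EM (an assumed choice of
    mixture model) and return, for each loss, the posterior probability of the
    lower-mean component, interpreted here as the "clean" component."""
    lo, hi = min(losses), max(losses)
    means = [lo, hi]                          # start the components at the extremes
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]

    def e_step():
        # Posterior probability that each loss belongs to component 0.
        post = []
        for x in losses:
            p0 = weights[0] * gaussian_pdf(x, means[0], variances[0])
            p1 = weights[1] * gaussian_pdf(x, means[1], variances[1])
            total = p0 + p1
            post.append(p0 / total if total > 0.0 else 0.5)
        return post

    for _ in range(iters):
        post0 = e_step()
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            r_k = post0 if k == 0 else [1.0 - r for r in post0]
            n_k = sum(r_k) + 1e-12
            weights[k] = n_k / len(losses)
            means[k] = sum(r * x for r, x in zip(r_k, losses)) / n_k
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(r_k, losses)) / n_k
            variances[k] = max(var, 1e-6)     # floor keeps the density finite

    post0 = e_step()
    if means[0] <= means[1]:
        return post0
    return [1.0 - r for r in post0]


def divide_by_loss(losses, threshold=0.5):
    """Split sample indices into (labeled, unlabeled) by clean probability.
    The threshold value is an illustrative assumption."""
    probs = clean_probabilities(losses)
    labeled = [i for i, p in enumerate(probs) if p > threshold]
    unlabeled = [i for i, p in enumerate(probs) if p <= threshold]
    return labeled, unlabeled


def co_divide(losses_net_a, losses_net_b, threshold=0.5):
    """Each network trains on the division computed from the OTHER network's losses."""
    return {
        "net_a": divide_by_loss(losses_net_b, threshold),
        "net_b": divide_by_loss(losses_net_a, threshold),
    }


if __name__ == "__main__":
    # Synthetic per-sample losses: 80 low-loss samples followed by 20 high-loss samples.
    demo = [0.10 + 0.002 * i for i in range(80)] + [1.5 + 0.05 * i for i in range(20)]
    labeled, unlabeled = divide_by_loss(demo)
    print(len(labeled), len(unlabeled))
```

In this sketch the samples with low loss end up in the labeled set and keep their labels, while the high-loss samples have their labels discarded and are treated as unlabeled data for the semi-supervised phase. Swapping the divisions between the two networks, as in `co_divide`, mirrors the abstract's point that each network uses the division from the other to limit confirmation bias.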

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Machine Learning and Algorithms