Learning from Noisy Labels with Distillation

Yuncheng Li; Jianchao Yang; Yale Song; Liangliang Cao; Jiebo Luo; Li-Jia Li

arXiv:1703.02391 · cs.CV · April 11, 2017 · 59 citations

Open Access

TL;DR

This paper introduces a distillation-based framework that uses side information, such as a small clean dataset and label relations in a knowledge graph, to improve learning from noisy labels, and it provides new benchmark datasets for realistic evaluation.

Contribution

It proposes a unified distillation approach that uses side information to better handle multi-mode noisy labels and introduces new benchmark datasets for realistic evaluation.

Findings

01. Effective across Sports, Species, and Artifacts domains

02. Outperforms traditional methods on new benchmarks

03. Demonstrates robustness to real-world noisy labels

Abstract

The ability to learn from noisy labels is very useful in many visual recognition tasks, as vast amounts of data with noisy labels are relatively easy to obtain. Traditionally, label noise has been treated as statistical outliers, and approaches such as importance re-weighting and bootstrapping have been proposed to alleviate the problem. According to our observation, real-world noisy labels exhibit multi-mode characteristics, as the true labels do, rather than behaving like independent random outliers. In this work, we propose a unified distillation framework that uses side information, including a small clean dataset and label relations in a knowledge graph, to "hedge the risk" of learning from noisy labels. Furthermore, unlike traditional approaches evaluated on simulated label noise, we propose a suite of new benchmark datasets in the Sports, Species, and Artifacts domains,…
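To make the "hedge the risk" idea concrete, here is a minimal sketch of one way distillation can combine a noisy label with side information. It assumes an auxiliary model trained on the small clean dataset that emits one logit per example, and it blends that model's soft prediction with the noisy label before computing a cross-entropy loss. The function names (`blend_labels`, `binary_cross_entropy`) and the blend weight `lam` are hypothetical, introduced only for illustration; the abstract does not specify the exact formulation, and the knowledge-graph component is not shown.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def blend_labels(noisy_labels, clean_model_logits, lam=0.5):
    """Blend noisy 0/1 labels with soft labels from a clean-data model.

    Assumption (not stated in the abstract): the training target is a
    convex combination lam * noisy + (1 - lam) * soft.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    soft = [sigmoid(z) for z in clean_model_logits]
    return [lam * y + (1.0 - lam) * s for y, s in zip(noisy_labels, soft)]


def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean binary cross-entropy between soft targets and predictions."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


# Hypothetical usage: two examples whose noisy labels are 1 and 0,
# while the clean-data model is undecided (logit 0) on both.
targets = blend_labels([1, 0], [0.0, 0.0], lam=0.5)
loss = binary_cross_entropy(targets, [0.9, 0.2])
```

With `lam=1.0` the targets reduce to the raw noisy labels, and with `lam=0.0` they reduce to the clean-data model's predictions, so in this sketch `lam` controls how much the noisy labels are trusted.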

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Machine Learning and Algorithms