Learning from Noisy Labels with Distillation
Yuncheng Li, Jianchao Yang, Yale Song, Liangliang Cao, Jiebo Luo, Li-Jia Li

TL;DR
This paper introduces a distillation-based framework leveraging side information like clean data and label relations to improve learning from noisy labels, and provides new benchmarks for practical evaluation.
Contribution
It proposes a unified distillation approach that uses side information to better handle multi-mode noisy labels and introduces new benchmark datasets for realistic evaluation.
Findings
Effective across Sports, Species, and Artifacts domains
Outperforms traditional methods on new benchmarks
Demonstrates robustness to real-world noisy labels
Abstract
The ability of learning from noisy labels is very useful in many visual recognition tasks, as a vast amount of data with noisy labels are relatively easy to obtain. Traditionally, the label noises have been treated as statistical outliers, and approaches such as importance re-weighting and bootstrap have been proposed to alleviate the problem. According to our observation, the real-world noisy labels exhibit multi-mode characteristics as the true labels, rather than behaving like independent random outliers. In this work, we propose a unified distillation framework to use side information, including a small clean dataset and label relations in knowledge graph, to "hedge the risk" of learning from noisy labels. Furthermore, unlike the traditional approaches evaluated based on simulated label noises, we propose a suite of new benchmark datasets, in Sports, Species and Artifacts domains,…
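The abstract does not spell out how the side information enters training, so the following is only a minimal sketch of one plausible way to "hedge the risk": blend each noisy label with the soft score of a teacher model trained on the small clean dataset, then train the student on the blended target. The function names, the mixing weight `lam`, and the toy numbers are all assumptions made for illustration and are not taken from the text above.

```python
import math


def distilled_targets(noisy_labels, teacher_scores, lam):
    """Blend noisy labels with teacher soft scores.

    Assumed form (hypothetical, not quoted from the source):
        target = lam * noisy_label + (1 - lam) * teacher_score
    lam = 1 trusts the noisy labels fully; lam = 0 trusts the teacher fully.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    if len(noisy_labels) != len(teacher_scores):
        raise ValueError("inputs must have the same length")
    return [lam * y + (1.0 - lam) * s
            for y, s in zip(noisy_labels, teacher_scores)]


def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean binary cross-entropy of student probabilities against soft targets."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


# Toy data: the teacher disagrees with the third noisy label.
noisy = [1.0, 0.0, 1.0]
teacher = [0.9, 0.2, 0.1]
soft = distilled_targets(noisy, teacher, lam=0.5)

# A hypothetical student that sides with the teacher on the third example.
student_probs = [0.9, 0.1, 0.2]
loss_vs_noisy = binary_cross_entropy(noisy, student_probs)
loss_vs_soft = binary_cross_entropy(soft, student_probs)
```

In this toy setup the blended target for the third example sits between the noisy label and the teacher score, so a student that follows the teacher there is penalized less than it would be against the raw noisy label. How `lam` should be chosen is left open here.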
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Machine Learning and Algorithms
