Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
Lu Jiang, Di Huang, Mason Liu, Weilong Yang

TL;DR
This paper introduces a benchmark for real-world web label noise, proposes a method to handle noisy labels effectively, and conducts extensive analysis of deep learning performance across various noise conditions.
Contribution
It establishes the first controlled real-world label noise benchmark, presents a new noise-robust training method, and provides a comprehensive study of deep neural networks on noisy data.
Findings
The proposed method achieves the best result on the new benchmark as well as on two public benchmarks (CIFAR and WebVision).
Deep neural networks exhibit varying robustness depending on noise type and level.
Controlled experiments reveal insights into training dynamics with noisy labels.
Abstract
Performing controlled experiments on noisy data is essential in understanding deep learning across noise levels. Due to the lack of suitable datasets, previous research has only examined deep learning on controlled synthetic label noise, and real-world label noise has never been studied in a controlled setting. This paper makes three contributions. First, we establish the first benchmark of controlled real-world label noise from the web. This new benchmark enables us to study the web label noise in a controlled setting for the first time. The second contribution is a simple but effective method to overcome both synthetic and real noisy labels. We show that our method achieves the best result on our dataset as well as on two public benchmarks (CIFAR and WebVision). Third, we conduct the largest study by far into understanding deep neural networks trained on noisy labels across different…
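To make the idea of "controlled synthetic label noise" concrete, the sketch below flips a fixed fraction of clean labels to a different, uniformly chosen class, so the noise level is known exactly. This is a common symmetric-noise setup and is offered only as an illustration; it is not the paper's benchmark construction or its training method. The function name `inject_symmetric_noise` and its parameters are hypothetical.

```python
import random


def inject_symmetric_noise(labels, num_classes, noise_level, seed=0):
    """Return a copy of `labels` with a fixed fraction flipped to a wrong class.

    Illustrative assumption: noise is symmetric, i.e. every wrong class is
    equally likely. The function name and signature are hypothetical.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes to flip a label")

    rng = random.Random(seed)
    noisy = list(labels)

    # Choose exactly round(noise_level * N) distinct examples to corrupt,
    # so the realized noise level matches the requested one.
    num_to_flip = round(noise_level * len(noisy))
    flip_indices = rng.sample(range(len(noisy)), num_to_flip)

    for i in flip_indices:
        # Sample uniformly among classes other than the current label.
        wrong_classes = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(wrong_classes)
    return noisy


# Toy data: 1000 examples spread evenly over 10 classes.
clean = [i % 10 for i in range(1000)]
noisy = inject_symmetric_noise(clean, num_classes=10, noise_level=0.2)

# Fraction of labels that differ from the clean ones.
observed = sum(c != n for c, n in zip(clean, noisy)) / len(clean)
print(observed)
```

Because every flipped label is forced to differ from its clean value, the observed noise fraction equals the requested level. Real-world web label noise, by contrast, has no such known flip pattern, which is the gap the abstract says the new benchmark addresses.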
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)
