Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels
Pengfei Chen, Benben Liao, Guangyong Chen, Shengyu Zhang

TL;DR
This paper analyzes how noisy labels affect the test accuracy of deep neural networks, shows that under symmetric noise the test accuracy is a quadratic function of the noise ratio, and proposes a robust training method that uses cross-validation to identify correctly labeled samples and Co-teaching to train on them.
Contribution
It introduces a quantitative characterization of test accuracy in terms of the noise ratio, and a training strategy that combines cross-validation with Co-teaching to make training robust to noisy labels.
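For readers unfamiliar with Co-teaching, the following is a minimal sketch of its core exchange step as it is commonly described: two networks each pick the samples in a mini-batch on which their own loss is smallest, and each network is then updated on the samples picked by its peer. The function names and the `forget_rate` parameter are illustrative assumptions, and how the paper adapts this step to the samples identified by cross-validation is not covered in this summary.

```python
def small_loss_indices(losses, forget_rate):
    """Indices of the (1 - forget_rate) fraction of samples with the smallest loss."""
    num_keep = int((1.0 - forget_rate) * len(losses))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return order[:num_keep]


def co_teaching_exchange(losses_a, losses_b, forget_rate):
    """Return (indices to update network A on, indices to update network B on).

    losses_a and losses_b are per-sample losses of two hypothetical networks
    on the same mini-batch. Each network selects its own small-loss samples,
    and those samples are handed to the *other* network for its update.
    """
    update_b_on = small_loss_indices(losses_a, forget_rate)
    update_a_on = small_loss_indices(losses_b, forget_rate)
    return update_a_on, update_b_on


# Toy per-sample losses for a mini-batch of four samples.
update_a_on, update_b_on = co_teaching_exchange(
    [0.1, 2.0, 0.3, 0.2], [0.2, 0.1, 1.5, 0.4], forget_rate=0.5
)
```

The intuition is that samples with large loss are more likely to be mislabeled, so dropping them limits how much of the noise each network memorizes.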
Findings
Test accuracy is a quadratic function of noise ratio under symmetric noise.
The proposed method outperforms state-of-the-art techniques on synthetic and real-world noisy datasets.
Cross-validation on random splits of the noisy dataset identifies most of the correctly labeled samples, which are then used for robust training.
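The quadratic relationship in the first finding can be illustrated with a short sketch. It rests on an assumption that is not spelled out in this summary: on a held-out set with the same symmetric noise, the network's prediction and the noisy label are treated as independent draws given the true class, each equal to the true class with probability 1 - r and to any one of the other c - 1 classes with probability r / (c - 1). Under that assumption, the probability that prediction and label agree is (1 - r)^2 + r^2 / (c - 1), which is quadratic in the noise ratio r. The function name below is hypothetical.

```python
def agreement_accuracy(noise_ratio, num_classes):
    """Probability that prediction and noisy label agree under symmetric noise.

    Assumes both follow the same symmetric noise transition and are
    independent given the true class (an assumption of this sketch).
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if not 0.0 <= noise_ratio <= 1.0:
        raise ValueError("noise_ratio must lie in [0, 1]")
    keep = 1.0 - noise_ratio                 # probability of the true class
    flip = noise_ratio / (num_classes - 1)   # probability of each other class
    # Agree on the true class, or agree on the same wrong class.
    return keep ** 2 + (num_classes - 1) * flip ** 2


# Evaluate the curve at a few noise ratios for a 10-class problem.
curve = [(r / 10, agreement_accuracy(r / 10, 10)) for r in range(0, 10, 2)]
```

With no noise the expression equals 1, and it decreases as the noise ratio grows until prediction and label are no better than uniformly random guesses.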
Abstract
Noisy labels are ubiquitous in real-world datasets, which poses a challenge for robustly training deep neural networks (DNNs), as DNNs usually have a high capacity to memorize the noisy labels. In this paper, we find that the test accuracy can be quantitatively characterized in terms of the noise ratio in datasets. In particular, the test accuracy is a quadratic function of the noise ratio in the case of symmetric noise, which explains previously published experimental findings. Based on our analysis, we apply cross-validation to randomly split noisy datasets, which identifies most samples that have correct labels. Then we adopt the Co-teaching strategy, which takes full advantage of the identified samples to train DNNs robustly against noisy labels. Compared with extensive state-of-the-art methods, our strategy consistently improves the generalization performance of DNNs under both…
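The cross-validation step described in the abstract can be sketched as follows: split the noisy dataset into two halves, train on one half, and keep the samples in the other half whose given label matches the model's prediction, then swap the roles of the halves. This is a toy illustration and not the authors' implementation. A nearest-centroid classifier stands in for the DNN, and all function names and the toy data are assumptions made for this example.

```python
import random


def fit_centroids(xs, ys):
    """Toy stand-in for training a DNN: one mean vector per (noisy) class."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: [sum(col) / len(pts) for col in zip(*pts)]
            for y, pts in groups.items()}


def predict(centroids, x):
    """Class whose centroid is closest to x in squared Euclidean distance."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))


def split_halves(n, seed=0):
    """Randomly split indices 0..n-1 into two halves."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx[: n // 2], idx[n // 2:]


def select_consistent(xs, ys, folds):
    """Indices whose given label matches a model trained on the other fold."""
    selected = []
    for train, held_out in (folds, folds[::-1]):
        model = fit_centroids([xs[i] for i in train], [ys[i] for i in train])
        selected += [i for i in held_out if predict(model, xs[i]) == ys[i]]
    return sorted(selected)


# Toy data: two well-separated clusters, with the labels of samples 3 and 14 flipped.
xs = [(0.1 * k, 0.0) for k in range(10)] + [(5.0 + 0.1 * k, 5.0) for k in range(10)]
noisy = [0] * 10 + [1] * 10
noisy[3], noisy[14] = 1, 0
selected = select_consistent(xs, noisy, split_halves(len(xs), seed=0))
```

The selected indices would then form the subset on which a robust training procedure such as Co-teaching is run.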
Taxonomy
Topics: Machine Learning and Data Classification · Water Systems and Optimization · Anomaly Detection Techniques and Applications
