Learning with Imbalanced Noisy Data by Preventing Bias in Sample Selection
Huafeng Liu, Mengmeng Sheng, Zeren Sun, Yazhou Yao, Xian-Sheng Hua, and Heng-Tao Shen

TL;DR
This paper introduces a novel approach to training deep models on imbalanced datasets with noisy labels by preventing bias in sample selection and employing label correction and regularization techniques.
Contribution
It proposes Class-Balance-based sample Selection (CBS), Confidence-based Sample Augmentation (CSA), and label correction with the Average Confidence Margin (ACM) to improve learning under noisy, imbalanced conditions.
Findings
Outperforms existing methods on synthetic and real-world datasets.
Effectively handles class imbalance and label noise.
Improves model robustness and accuracy.
Abstract
Learning with noisy labels has gained increasing attention because the inevitably imperfect labels in real-world scenarios can substantially hurt deep model performance. Recent studies tend to regard low-loss samples as clean ones and discard high-loss ones to alleviate the negative impact of noisy labels. However, real-world datasets contain not only noisy labels but also class imbalance. The imbalance issue is prone to causing failure in loss-based sample selection, since the under-learning of tail classes also tends to produce high losses. To this end, we propose a simple yet effective method to address noisy labels in imbalanced datasets. Specifically, we propose Class-Balance-based sample Selection (CBS) to prevent the tail class samples from being neglected during training. We propose Confidence-based Sample Augmentation (CSA) for the chosen clean samples to enhance their…
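The failure mode described in the abstract can be illustrated with a small sketch. This is not the paper's CBS procedure, whose details are not given in the abstract; it is a minimal illustration that assumes "class-balanced" means applying the small-loss criterion within each observed class instead of across the whole dataset. The function names, the `keep_ratio` parameter, and the toy losses are all hypothetical.

```python
import numpy as np

def global_small_loss_selection(losses, keep_ratio):
    """Keep the keep_ratio fraction of samples with the lowest loss, ignoring class."""
    n_keep = int(round(keep_ratio * len(losses)))
    return np.sort(np.argsort(losses, kind="stable")[:n_keep])

def class_balanced_selection(losses, labels, keep_ratio):
    """Keep the lowest-loss fraction within each observed (possibly noisy) class.

    Assumption for illustration only: per-class ranking with at least one
    sample kept per class. The paper's actual CBS rule may differ.
    """
    selected = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)           # samples labelled as class c
        n_keep = max(1, int(round(keep_ratio * len(idx))))
        order = np.argsort(losses[idx], kind="stable")
        selected.append(idx[order[:n_keep]])
    return np.sort(np.concatenate(selected))

# Toy data: class 0 is the head class (6 samples, low loss),
# class 1 is an under-learned tail class (2 samples, high loss).
losses = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 2.0, 2.5])
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1])

global_idx = global_small_loss_selection(losses, keep_ratio=0.5)
balanced_idx = class_balanced_selection(losses, labels, keep_ratio=0.5)

print("global:  ", global_idx, "classes kept:", np.unique(labels[global_idx]))
print("balanced:", balanced_idx, "classes kept:", np.unique(labels[balanced_idx]))
```

In this toy setup the global rule keeps only head-class samples, because every tail-class sample has a high loss regardless of whether its label is correct, while the per-class rule retains a tail-class sample as well.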
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques · Advanced Statistical Methods and Models
