Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels
Ganlong Zhao, Guanbin Li, Yipeng Qin, Feng Liu, Yizhou Yu

TL;DR
This paper introduces a two-stage method for identifying clean samples in training data with instance-dependent noisy labels, improving deep learning robustness by addressing class imbalance and noise complexity.
Contribution
The paper proposes a novel two-stage approach combining class-level feature clustering and a consistency-based classifier to effectively identify clean samples under instance-dependent noise.
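The first stage can be illustrated with a minimal sketch. It assumes (these details are not given in this summary) that each class center is the mean of L2-normalised features over samples carrying that noisy label, that closeness is measured by cosine similarity, and that a fixed fraction of each class is kept. The function name `select_central_samples` and the parameter `keep_ratio` are hypothetical, and the paper's actual clustering procedure and its handling of rare classes may differ.

```python
import numpy as np


def select_central_samples(features, noisy_labels, keep_ratio=0.5):
    """Return a boolean mask of samples closest to their class center.

    Assumptions (illustrative, not taken from the paper):
    - class centers are means of L2-normalised features per noisy label;
    - closeness is cosine similarity to that center;
    - keep_ratio (hypothetical) is the fraction kept in every class.
    """
    features = np.asarray(features, dtype=float)
    noisy_labels = np.asarray(noisy_labels)

    # Normalise features so that dot products are cosine similarities.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)

    mask = np.zeros(len(unit), dtype=bool)
    for c in np.unique(noisy_labels):
        idx = np.flatnonzero(noisy_labels == c)
        center = unit[idx].mean(axis=0)
        center = center / max(np.linalg.norm(center), 1e-12)
        sims = unit[idx] @ center
        # Keep the most central samples of this class (at least one).
        n_keep = max(1, int(np.ceil(keep_ratio * len(idx))))
        mask[idx[np.argsort(-sims)[:n_keep]]] = True
    return mask
```

Samples left unselected are not necessarily noisy; in a two-stage scheme like the one described, they would be passed on to the second, consistency-based stage rather than discarded.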
Findings
Outperforms state-of-the-art methods on challenging benchmarks.
Effectively handles class imbalance and complex noise patterns.
Improves model generalization with noisy labels.
Abstract
Deep models trained with noisy labels are prone to over-fitting and struggle to generalize. Most existing solutions rest on the idealized assumption that label noise is class-conditional, i.e., that instances of the same class share the same noise model, independent of their features. In practice, however, real-world noise patterns are usually more fine-grained and instance-dependent, which poses a major challenge, especially in the presence of inter-class imbalance. In this paper, we propose a two-stage clean samples identification method to address this challenge. First, we employ a class-level feature clustering procedure for the early identification of clean samples that are near the class-wise prediction centers. Notably, we address the class imbalance problem by aggregating rare classes according to their prediction entropy. Second, for the remaining clean…
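The distinction between the two noise models can be written out explicitly. The notation here is an assumption for illustration and is not taken from the paper: $x$ is an instance, $y$ its true label, $\tilde{y}$ its observed (possibly corrupted) label, and $T$ the label transition probabilities.

```latex
% Class-conditional noise: the flip probability depends only on the true class.
P(\tilde{y} = j \mid y = i, x) = P(\tilde{y} = j \mid y = i) = T_{ij}

% Instance-dependent noise: the flip probability also varies with the instance.
P(\tilde{y} = j \mid y = i, x) = T_{ij}(x)
```

Under the first model a single matrix $T$ describes the whole dataset; under the second, every instance can have its own transition probabilities, which is why a single class-level correction is generally insufficient.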
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques
