Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels

Ganlong Zhao; Guanbin Li; Yipeng Qin; Feng Liu; Yizhou Yu

arXiv:2207.14476·cs.CV·August 1, 2022·5 cites

TL;DR

This paper introduces a two-stage method for identifying clean samples in training data with instance-dependent noisy labels, improving deep learning robustness by addressing class imbalance and noise complexity.

Contribution

The paper proposes a novel two-stage approach combining class-level feature clustering and a consistency-based classifier to effectively identify clean samples under instance-dependent noise.

Findings

01. Outperforms state-of-the-art methods on challenging benchmarks.
02. Effectively handles class imbalance and complex noise patterns.
03. Improves model generalization with noisy labels.

Abstract

Deep models trained with noisy labels are prone to over-fitting and generalize poorly. Most existing solutions rest on the idealized assumption that label noise is class-conditional, i.e., instances of the same class share the same noise model, which is independent of their features. In practice, however, real-world noise patterns are usually finer-grained and instance-dependent, which poses a big challenge, especially in the presence of inter-class imbalance. In this paper, we propose a two-stage clean samples identification method to address this challenge. First, we employ a class-level feature clustering procedure for the early identification of clean samples that are near the class-wise prediction centers. Notably, we address the class imbalance problem by aggregating rare classes according to their prediction entropy. Second, for the remaining clean…
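
As a rough illustration of the first stage described in the abstract (keeping samples that lie near their class center), the sketch below ranks the samples of each noisy class by cosine similarity to that class's mean feature and keeps the closest fraction. This is not the authors' implementation: the function name `select_central_samples`, the `keep_ratio` parameter, and the choice of cosine similarity are assumptions made for illustration, and the entropy-based aggregation of rare classes is omitted.

```python
import numpy as np


def select_central_samples(features, noisy_labels, keep_ratio=0.5):
    """Hypothetical helper: flag, per noisy class, the samples nearest the class center.

    features:     (n_samples, n_dims) array of feature vectors.
    noisy_labels: (n_samples,) array of possibly corrupted class labels.
    keep_ratio:   assumed fraction of each class to treat as likely clean.
    Returns a boolean mask of shape (n_samples,).
    """
    features = np.asarray(features, dtype=float)
    noisy_labels = np.asarray(noisy_labels)

    # L2-normalise so that a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)

    mask = np.zeros(len(unit), dtype=bool)
    for c in np.unique(noisy_labels):
        idx = np.flatnonzero(noisy_labels == c)

        # Class center: normalised mean of the class's unit features.
        center = unit[idx].mean(axis=0)
        center = center / max(np.linalg.norm(center), 1e-12)

        # Keep the samples most similar to the center.
        similarity = unit[idx] @ center
        n_keep = max(1, int(np.ceil(keep_ratio * len(idx))))
        mask[idx[np.argsort(-similarity)[:n_keep]]] = True
    return mask
```

Samples left unselected by such a filter are not necessarily noisy; in the paper they are handled by the second, consistency-based stage.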

Code & Models

Repositories

uitrbn/tscsi_idn
PyTorch · Official

Taxonomy

Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques