Clusterability as an Alternative to Anchor Points When Learning with Noisy Labels
Zhaowei Zhu, Yiwen Song, Yang Liu

TL;DR
This paper introduces a novel method for estimating label noise transition matrices using clusterability conditions, avoiding the need for anchor points, and demonstrates improved accuracy on synthetic and real noisy datasets.
Contribution
The paper proposes a clusterability-based estimation procedure for transition matrices, offering better sample complexity and accuracy compared to anchor point methods.
Findings
Effective estimation with synthetic noisy labels on CIFAR-10/100
Improved estimation accuracy on real human-annotated datasets
The method draws on more instances than anchor-point approaches, which gives it better sample complexity
Abstract
The label noise transition matrix, characterizing the probabilities of a training instance being wrongly annotated, is crucial to designing popular solutions to learning with noisy labels. Existing works heavily rely on finding "anchor points" or their approximates, defined as instances belonging to a particular class almost surely. Nonetheless, finding anchor points remains a non-trivial task, and the estimation accuracy is also often throttled by the number of available anchor points. In this paper, we propose an alternative option to the above task. Our main contribution is the discovery of an efficient estimation procedure based on a clusterability condition. We prove that with clusterable representations of features, using up to third-order consensuses of noisy labels among neighbor representations is sufficient to estimate a unique transition matrix. Compared with methods using…
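To make the "consensuses of noisy labels among neighbor representations" concrete, the following is a minimal sketch and not the authors' estimator. It assumes a hypothetical 3-class transition matrix T and class prior p, assumes that an instance and its two nearest neighbors share one true label (the clusterability condition), and assumes the three noisy labels are drawn independently given that true label. Under those assumptions the first-, second-, and third-order label co-occurrence frequencies have closed forms in T and p, which the simulation checks. The function names and the simplified co-occurrence definition are illustrative and may differ from the paper's exact formulation.

```python
import numpy as np


def expected_consensus(T, p):
    """Closed-form label co-occurrence frequencies, assuming the noisy labels
    of an instance and its two neighbors are i.i.d. given a shared true label.
    T[k, i] = P(noisy label = i | true label = k); p[k] = P(true label = k)."""
    c1 = np.einsum("k,ki->i", p, T)                 # P(noisy_1 = i)
    c2 = np.einsum("k,ki,kj->ij", p, T, T)          # P(noisy_1 = i, noisy_2 = j)
    c3 = np.einsum("k,ki,kj,kl->ijl", p, T, T, T)   # joint over all three labels
    return c1, c2, c3


def empirical_consensus(T, p, n=200_000, seed=0):
    """Simulate n triplets (instance + 2 neighbors) and count co-occurrences."""
    rng = np.random.default_rng(seed)
    K = len(p)
    y = rng.choice(K, size=n, p=p)        # one true label shared by each triplet
    cum = np.cumsum(T, axis=1)            # row-wise CDF of the noise distribution
    u = rng.random((n, 3))
    # Inverse-CDF sampling: count how many CDF thresholds each uniform draw exceeds.
    noisy = (u[:, :, None] > cum[y][:, None, :]).sum(axis=2)
    noisy = np.minimum(noisy, K - 1)      # guard against floating-point round-off
    a, b, c = noisy.T
    c1 = np.bincount(a, minlength=K) / n
    c2 = np.zeros((K, K))
    np.add.at(c2, (a, b), 1.0)
    c3 = np.zeros((K, K, K))
    np.add.at(c3, (a, b, c), 1.0)
    return c1, c2 / n, c3 / n


# Hypothetical noise model, chosen only for illustration.
T = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7]])
p = np.array([0.5, 0.3, 0.2])

expected = expected_consensus(T, p)
observed = empirical_consensus(T, p)
for order, (e, o) in enumerate(zip(expected, observed), start=1):
    print(f"order {order}: max abs gap = {np.abs(e - o).max():.4f}")
```

The observed frequencies can be computed from noisy labels alone, without access to true labels or anchor points. Estimating the transition matrix then amounts to finding T and p whose closed-form consensuses match the observed ones; the abstract states that consensuses up to third order are sufficient for a unique solution.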
Taxonomy
Topics: Machine Learning and Data Classification · Infrastructure Maintenance and Monitoring · Machine Learning and Algorithms
Methods: High-Order Consensuses
