Clusterability as an Alternative to Anchor Points When Learning with Noisy Labels

Zhaowei Zhu; Yiwen Song; Yang Liu

arXiv:2102.05291·cs.LG·July 15, 2021·22 cites

TL;DR

This paper introduces a novel method for estimating label noise transition matrices using clusterability conditions, avoiding the need for anchor points, and demonstrates improved accuracy on synthetic and real noisy datasets.

Contribution

The paper proposes a clusterability-based estimation procedure for transition matrices, offering better sample complexity and accuracy compared to anchor point methods.

Findings

01. Effective estimation with synthetic noisy labels on CIFAR-10/100

02. Improved estimation accuracy on real human-annotated datasets

03. The method benefits from using more instances and achieves better sample complexity

Abstract

The label noise transition matrix, characterizing the probabilities of a training instance being wrongly annotated, is crucial to designing popular solutions to learning with noisy labels. Existing works heavily rely on finding "anchor points" or their approximates, defined as instances belonging to a particular class almost surely. Nonetheless, finding anchor points remains a non-trivial task, and the estimation accuracy is also often throttled by the number of available anchor points. In this paper, we propose an alternative option to the above task. Our main contribution is the discovery of an efficient estimation procedure based on a clusterability condition. We prove that with clusterable representations of features, using up to third-order consensuses of noisy labels among neighbor representations is sufficient to estimate a unique transition matrix. Compared with methods using…
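The idea of "consensuses of noisy labels among neighbor representations" can be illustrated with a minimal sketch. It assumes the consensuses are tabulated as joint frequencies of the noisy label of each instance and the noisy labels of its two nearest neighbors in feature space. The function name `consensus_frequencies`, the brute-force Euclidean neighbor search, and the toy data are illustrative assumptions, not the authors' implementation, and the step that solves for the transition matrix from these statistics is not shown.

```python
import numpy as np


def consensus_frequencies(features, noisy_labels, num_classes):
    """Tabulate first-, second-, and third-order noisy-label frequencies.

    Hypothetical helper: each instance is grouped with its 2 nearest
    neighbors (Euclidean distance, brute force), and the joint frequencies
    of their noisy labels are counted.
    """
    features = np.asarray(features, dtype=float)
    noisy_labels = np.asarray(noisy_labels, dtype=int)
    n = features.shape[0]

    # Pairwise squared Euclidean distances; an instance is not its own neighbor.
    sq_norms = np.sum(features ** 2, axis=1)
    dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dist, np.inf)

    # Indices of the two nearest neighbors of every instance.
    neighbors = np.argsort(dist, axis=1)[:, :2]
    y0 = noisy_labels
    y1 = noisy_labels[neighbors[:, 0]]
    y2 = noisy_labels[neighbors[:, 1]]

    # First order: frequency of each noisy label.
    first = np.bincount(y0, minlength=num_classes) / n

    # Second order: joint frequency of (instance label, nearest-neighbor label).
    second = np.zeros((num_classes, num_classes))
    np.add.at(second, (y0, y1), 1.0)
    second /= n

    # Third order: joint frequency over the instance and both neighbors.
    third = np.zeros((num_classes, num_classes, num_classes))
    np.add.at(third, (y0, y1, y2), 1.0)
    third /= n

    return first, second, third


if __name__ == "__main__":
    # Toy data (assumed): two well-separated clusters with some label noise.
    rng = np.random.default_rng(0)
    feats = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(5.0, 0.1, (50, 2))])
    clean = np.repeat([0, 1], 50)
    flip = rng.random(100) < 0.2
    noisy = np.where(flip, 1 - clean, clean)
    first, second, third = consensus_frequencies(feats, noisy, num_classes=2)
    print(first)
    print(second)
```

Under a clusterability assumption, an instance and its nearest neighbors share the same clean label, so disagreement among their noisy labels carries information about the noise rates. The sketch only gathers the agreement statistics; how they determine a unique transition matrix is the subject of the paper.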

Code & Models

Videos

Clusterability as an Alternative to Anchor Points When Learning with Noisy Labels · slideslive

Taxonomy

Topics: Machine Learning and Data Classification · Infrastructure Maintenance and Monitoring · Machine Learning and Algorithms

Methods: High-Order Consensuses