A Survey of Label-noise Representation Learning: Past, Present and Future

Bo Han; Quanming Yao; Tongliang Liu; Gang Niu; Ivor W. Tsang; James T. Kwok; Masashi Sugiyama

arXiv:2011.04406·cs.LG·February 23, 2021·101 cites

TL;DR

This survey comprehensively reviews Label-Noise Representation Learning (LNRL), analyzing its theoretical foundations, categorizing existing methods, and proposing future research directions for robust deep learning with noisy labels.

Contribution

It provides a formal definition of LNRL, categorizes existing methods, discusses their strengths and weaknesses, and suggests new research avenues and datasets.

Findings

1. Categorized LNRL methods into three main directions.
2. Identified key components for robust LNRL.
3. Proposed future research directions including new datasets and noise types.

Abstract

Classical machine learning implicitly assumes that labels of the training data are sampled from a clean distribution, which can be too restrictive for real-world scenarios. However, statistical-learning-based methods may not train deep learning models robustly with these noisy labels. Therefore, it is urgent to design Label-Noise Representation Learning (LNRL) methods for robustly training deep models with noisy labels. To fully understand LNRL, we conduct a survey study. We first clarify a formal definition for LNRL from the perspective of machine learning. Then, via the lens of learning theory and empirical study, we figure out why noisy labels affect deep models' performance. Based on the theoretical guidance, we categorize different LNRL methods into three directions. Under this unified taxonomy, we provide a thorough discussion of the pros and cons of different categories. More…

Code & Models

Repositories

bhanML/label-noise-papers (Official)

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Imbalanced Data Classification Techniques