A Survey of Label-noise Representation Learning: Past, Present and Future
Bo Han, Quanming Yao, Tongliang Liu, Gang Niu, Ivor W. Tsang, James T. Kwok, Masashi Sugiyama

TL;DR
This survey comprehensively reviews Label-Noise Representation Learning (LNRL), analyzing its theoretical foundations, categorizing existing methods, and proposing future research directions for robust deep learning with noisy labels.
Contribution
It provides a formal definition of LNRL, categorizes existing methods, discusses their strengths and weaknesses, and suggests new research avenues and datasets.
Findings
Categorized LNRL methods into three main directions.
Identified key components for robust LNRL.
Proposed future research directions including new datasets and noise types.
Abstract
Classical machine learning implicitly assumes that labels of the training data are sampled from a clean distribution, which can be too restrictive for real-world scenarios. However, statistical-learning-based methods may not train deep learning models robustly with these noisy labels. Therefore, it is urgent to design Label-Noise Representation Learning (LNRL) methods for robustly training deep models with noisy labels. To fully understand LNRL, we conduct a survey study. We first clarify a formal definition for LNRL from the perspective of machine learning. Then, via the lens of learning theory and empirical study, we figure out why noisy labels affect deep models' performance. Based on the theoretical guidance, we categorize different LNRL methods into three directions. Under this unified taxonomy, we provide a thorough discussion of the pros and cons of different categories.
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Imbalanced Data Classification Techniques
