Label-Noise Learning with Intrinsically Long-Tailed Data
Yang Lu, Yiliang Zhang, Bo Han, Yiu-ming Cheung, Hanzi Wang

TL;DR
This paper introduces TABASCO, a two-stage bi-dimensional sample selection framework for label-noise learning on long-tailed data. It aims to separate clean samples from noisy ones, especially in tail classes.
Contribution
The paper proposes a bi-dimensional sample selection method for long-tailed data with label noise. It targets a limitation of existing label-noise learning methods, which usually assume that the ground-truth classes are balanced.
Findings
TABASCO outperforms existing methods on benchmark datasets.
The dual-metric approach improves noise separation in tail classes.
Extensive experiments validate the effectiveness of the proposed framework.
Abstract
Label noise is one of the key factors that lead to the poor generalization of deep learning models. Existing label-noise learning methods usually assume that the ground-truth classes of the training data are balanced. However, real-world data is often imbalanced, so under label noise the observed class distribution becomes inconsistent with the intrinsic one. In this case, it is hard to distinguish clean samples from noisy samples on the intrinsic tail classes, because the intrinsic class distribution is unknown. In this paper, we propose a learning framework for label-noise learning with intrinsically long-tailed data. Specifically, we propose two-stage bi-dimensional sample selection (TABASCO) to better separate clean samples from noisy samples, especially for the tail classes. TABASCO consists of two new separation metrics that complement each other to compensate for the limitation of…
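The mismatch between observed and intrinsic class distributions described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's experimental setup: the exponential class-size profile, the symmetric (uniform) noise model, and the names `simulate_noisy_long_tail` and `per_class_noise` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_noisy_long_tail(n_classes=10, n_max=5000, imbalance=100,
                             noise_rate=0.4, seed=0):
    """Return (true_labels, observed_labels) for a synthetic long-tailed set.

    Assumptions (not from the paper): class sizes decay exponentially from
    n_max down to n_max / imbalance, and noise is symmetric, i.e. a flipped
    label is drawn uniformly over all classes.
    """
    rng = np.random.default_rng(seed)
    exponents = np.arange(n_classes) / (n_classes - 1)
    sizes = (n_max * (1.0 / imbalance) ** exponents).astype(int)
    true = np.repeat(np.arange(n_classes), sizes)

    # Select the samples whose labels get re-drawn.
    flip = rng.random(true.size) < noise_rate
    observed = true.copy()
    # A re-drawn label may coincide with the true class by chance.
    observed[flip] = rng.integers(0, n_classes, size=int(flip.sum()))
    return true, observed


def per_class_noise(true, observed, n_classes):
    """Fraction of samples labeled k whose intrinsic class is not k."""
    rates = np.full(n_classes, np.nan)
    for k in range(n_classes):
        mask = observed == k
        if mask.any():
            rates[k] = np.mean(true[mask] != k)
    return rates


if __name__ == "__main__":
    true, observed = simulate_noisy_long_tail()
    rates = per_class_noise(true, observed, 10)
    for k in range(10):
        print(k, int(np.sum(true == k)), int(np.sum(observed == k)),
              round(float(rates[k]), 3))
```

Under these assumptions, the observed counts look far more balanced than the intrinsic ones, and most samples carrying a tail-class label actually belong to other classes. That is the regime in which separating clean from noisy tail samples becomes difficult.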
Taxonomy
Topics: Infrastructure Maintenance and Monitoring · Machine Learning and Data Classification · Industrial Vision Systems and Defect Detection
