Learning from Long-Tailed Noisy Data with Sample Selection and Balanced Loss
Lefan Zhang, Zhang-Hao Tian, Wujun Zhou, Wei Wang

TL;DR
This paper introduces a robust semi-supervised learning method for long-tailed noisy data that uses sample selection and a balanced loss to improve deep learning performance on real-world noisy, imbalanced datasets.
Contribution
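The paper proposes an approach that uses sample selection to split long-tailed noisy training data into a clean labeled set and an unlabeled set, then trains a deep neural network in a semi-supervised manner with a balanced loss based on model bias. The authors report that it outperforms existing state-of-the-art methods on benchmarks.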
Findings
Outperforms state-of-the-art methods on benchmark datasets.
Effectively separates clean and noisy data for training.
Improves robustness of deep neural networks in noisy, imbalanced scenarios.
Abstract
The success of deep learning depends on large-scale, well-curated training data, whereas data in real-world applications are commonly long-tailed and noisy. Many methods have been proposed to deal with either long-tailed data or noisy data, but few have been developed to tackle long-tailed noisy data. To address this gap, we propose a robust method for learning from long-tailed noisy data with sample selection and a balanced loss. Specifically, we separate the noisy training data into a clean labeled set and an unlabeled set with sample selection, and train the deep neural network in a semi-supervised manner with a balanced loss based on model bias. Extensive experiments on benchmarks demonstrate that our method outperforms existing state-of-the-art methods.
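The abstract does not give the selection rule or the exact form of the balanced loss, so the following is only a rough sketch of the two ingredients under stated assumptions, not the authors' implementation. It assumes a per-class small-loss criterion for sample selection (the hypothetical `select_clean`) and a cross-entropy whose logits are shifted by the log of an estimated per-class model bias (the hypothetical `balanced_cross_entropy`). The names, the `keep_ratio` parameter, and the choice of bias estimate are all illustrative.

```python
import numpy as np


def select_clean(losses, labels, num_classes, keep_ratio=0.5):
    """Assumed selection rule: per-class small-loss selection.

    Within each class, the keep_ratio fraction of samples with the
    lowest loss is marked clean. Selecting per class (rather than
    globally) keeps tail classes from being discarded wholesale.
    Returns a boolean mask; False entries would be treated as unlabeled.
    """
    clean = np.zeros(len(losses), dtype=bool)
    for c in range(num_classes):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue
        k = max(1, int(np.ceil(keep_ratio * idx.size)))
        keep = idx[np.argsort(losses[idx])[:k]]
        clean[keep] = True
    return clean


def balanced_cross_entropy(logits, labels, bias):
    """Assumed balanced loss: cross-entropy on bias-adjusted logits.

    `bias` is a per-class positive vector, for example the model's
    average predicted probability per class (an assumption here).
    Adding log(bias) to the logits during training penalizes classes
    the model already favors and raises the loss on neglected ones.
    """
    adjusted = logits + np.log(bias)
    # Numerically stable log-softmax.
    adjusted = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = adjusted - np.log(np.exp(adjusted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

In this sketch, samples outside the returned mask would have their labels dropped and be passed to whichever semi-supervised learner is used, while the bias vector would be re-estimated as training proceeds.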
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Machine Learning and Data Classification
