Mitigating Instance-Dependent Label Noise: Integrating Self-Supervised Pretraining with Pseudo-Label Refinement
Gouranga Bala, Anuj Gupta, Subrat Kumar Behera, Amit Sethi

TL;DR
This paper combines self-supervised pretraining with iterative pseudo-label refinement to mitigate instance-dependent label noise in deep learning models, improving robustness and accuracy.
Contribution
It proposes a novel framework that leverages self-supervised learning and pseudo-label refinement specifically to address the challenges of instance-dependent label noise.
Findings
Outperforms state-of-the-art methods under high noise levels
Achieves significant accuracy improvements on CIFAR datasets with synthetic noise
Demonstrates robustness of the approach across varying noise conditions
Abstract
Deep learning models rely heavily on large volumes of labeled data to achieve high performance. However, real-world datasets often contain noisy labels due to human error, ambiguity, or resource constraints during the annotation process. Instance-dependent label noise (IDN), where the probability of a label being corrupted depends on the input features, poses a significant challenge because it is more prevalent and harder to address than instance-independent noise. In this paper, we propose a novel hybrid framework that combines self-supervised learning using SimCLR with iterative pseudo-label refinement to mitigate the effects of IDN. The self-supervised pre-training phase enables the model to learn robust feature representations without relying on potentially noisy labels, establishing a noise-agnostic foundation. Subsequently, we employ an iterative training process with pseudo-label…
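The contrastive objective behind the SimCLR pretraining phase can be sketched as follows. SimCLR trains with the normalized temperature-scaled cross-entropy (NT-Xent) loss, which pulls together the embeddings of two augmented views of the same image and pushes apart the embeddings of different images, without consulting any labels. This is a minimal pure-Python illustration, not the authors' implementation: the function names, the temperature of 0.5, and the toy two-dimensional embeddings are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss for paired embeddings (illustrative helper, hypothetical name).

    views_a[i] and views_b[i] are embeddings of two augmentations of image i.
    The temperature default is an assumption, not a value taken from the paper.
    """
    z = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of anchor i
        # Similarities of anchor i to every other embedding (self excluded).
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        positive = cosine(z[i], z[j]) / temperature
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - positive
    return total / (2 * n)


# Toy embeddings: each view in `a` is close to its partner in `b`.
a = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0, 0.1], [0.1, 1.0]]
aligned = nt_xent_loss(a, b)
# Swapping the partners pairs each view with the wrong image.
misaligned = nt_xent_loss(a, list(reversed(b)))
```

With correctly paired views the loss is lower than with swapped partners, which is the signal that drives the encoder toward label-free, augmentation-invariant features.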
Taxonomy
Topics: Music and Audio Processing
Methods: Average Pooling · Global Average Pooling · Kaiming Initialization · Convolution · Max Pooling · Random Gaussian Blur · Color Jitter · Dense Connections
