An Instance-Dependent Simulation Framework for Learning with Label Noise
Keren Gu, Xander Masotto, Vandana Bachani, Balaji Lakshminarayanan, Jack Nikodem, Dong Yin

TL;DR
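This paper introduces an instance-dependent simulation framework for generating realistic noisy labels. The framework supports controlled evaluation of algorithms for learning with noisy labels, and the paper also proposes a label correction method, Label Quality Model (LQM), that improves performance when added to existing techniques.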
Contribution
The paper presents a novel simulation framework for realistic label noise and a new label correction technique leveraging annotator features.
Findings
Synthetic noisy labels generated by the framework are closer in distribution to human labels than labels produced by independent or class-conditional random flipping.
Benchmarking reveals varying algorithm robustness to different noise types.
The Label Quality Model (LQM) improves model performance when integrated with existing noisy label techniques.
Abstract
We propose a simulation framework for generating instance-dependent noisy labels via a pseudo-labeling paradigm. We show that the distribution of the synthetic noisy labels generated with our framework is closer to human labels compared to independent and class-conditional random flipping. Equipped with controllable label noise, we study the negative impact of noisy labels across a few practical settings to understand when label noise is more problematic. We also benchmark several existing algorithms for learning with noisy labels and compare their behavior on our synthetic datasets and on the datasets with independent random label noise. Additionally, with the availability of annotator information from our simulation framework, we propose a new technique, Label Quality Model (LQM), that leverages annotator features to predict and correct against noisy labels. We show that by adding LQM…
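For readers unfamiliar with the two baselines the abstract compares against, the following is a minimal sketch of independent and class-conditional random flipping. It is not the paper's code: the function names `flip_independent` and `flip_class_conditional`, the `noise_rate` parameter, and the `transition` matrix layout (row = clean class, column = noisy class) are assumptions made for illustration. Both baselines choose the noisy label from the clean label alone and never look at the instance's features, which is the limitation an instance-dependent framework is meant to address.

```python
import random


def flip_independent(labels, num_classes, noise_rate, rng):
    """Independent flipping: with probability noise_rate, replace each
    label by a different class chosen uniformly at random."""
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            candidates = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(y)
    return noisy


def flip_class_conditional(labels, transition, rng):
    """Class-conditional flipping: draw each noisy label from the row of
    the transition matrix indexed by the clean label.

    transition[i][j] is the (assumed) probability that clean class i
    is observed as class j; each row should sum to 1.
    """
    classes = list(range(len(transition)))
    return [rng.choices(classes, weights=transition[y], k=1)[0] for y in labels]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    clean = [0, 1, 2, 1, 0, 2]

    # Hypothetical transition matrix: class 0 is confused with class 1
    # 30% of the time; classes 1 and 2 are always labeled correctly.
    transition = [
        [0.7, 0.3, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]

    print(flip_independent(clean, num_classes=3, noise_rate=0.2, rng=rng))
    print(flip_class_conditional(clean, transition, rng))
```

In both functions the flip probability is the same for every example of a given class, so an easy example and an ambiguous one are equally likely to be mislabeled.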
Taxonomy
Topics: Machine Learning and Data Classification · Music and Audio Processing · Water Systems and Optimization
Methods: Label Quality Model
