Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels
Yikai Wang, Yanwei Fu, and Xinwei Sun

TL;DR
This paper introduces a theoretically grounded framework called Knockoffs-SPR for selecting clean samples in noisy label learning, improving neural network robustness and generalization.
Contribution
The paper proposes a Scalable Penalized Regression (SPR) method and extends it with knockoff filters (Knockoffs-SPR) to select clean samples under noisy labels while controlling the rate of false selections.
Findings
Effective in identifying clean data in noisy datasets
Controls false selection rate in sample filtering
Improves neural network robustness and accuracy
Abstract
A noisy training set usually degrades the generalization and robustness of neural networks. In this paper, we propose a novel, theoretically guaranteed clean sample selection framework for learning with noisy labels. Specifically, we first present a Scalable Penalized Regression (SPR) method to model the linear relation between network features and one-hot labels. In SPR, the clean data are identified by the zero mean-shift parameters solved in the regression model. We theoretically show that SPR can recover clean data under some conditions. In general scenarios, these conditions may no longer be satisfied, and some noisy data are falsely selected as clean data. To solve this problem, we propose a data-adaptive method for Scalable Penalized Regression with Knockoff filters (Knockoffs-SPR), which provably controls the False-Selection-Rate (FSR) in the selected…
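The mean-shift idea in the abstract can be illustrated with a small sketch. It assumes the model Y = XB + G + noise, where each row of G is a per-sample mean-shift parameter, and it assumes a row-wise group penalty solved by simple alternating minimization. The function names `spr_select` and `false_selection_rate`, the penalty form, and the solver are illustrative assumptions, not the authors' implementation, and the knockoff filtering step is omitted entirely.

```python
import numpy as np


def spr_select(X, Y, lam, n_iter=200):
    """Sketch of mean-shift penalized regression (assumed form, not the paper's solver).

    Minimizes 0.5 * ||Y - X @ B - G||_F^2 + lam * sum_i ||G[i]||_2
    by alternating between B (least squares) and G (row-wise soft threshold).
    Samples whose mean-shift row G[i] is exactly zero are returned as clean.
    """
    n, c = Y.shape
    G = np.zeros((n, c))
    for _ in range(n_iter):
        # Update B: ordinary least squares on the shift-corrected labels.
        B, *_ = np.linalg.lstsq(X, Y - G, rcond=None)
        # Update G: group soft-thresholding of each residual row.
        R = Y - X @ B
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
        G = scale * R
    clean = np.linalg.norm(G, axis=1) == 0
    return clean, B, G


def false_selection_rate(selected, is_noisy):
    """Fraction of selected samples that are actually noisy (boolean arrays)."""
    n_selected = int(selected.sum())
    return float((selected & is_noisy).sum()) / max(n_selected, 1)
```

Here `X` would hold network features (one row per sample) and `Y` the one-hot labels. A larger `lam` forces more rows of `G` to zero and so selects more samples as clean; `false_selection_rate` needs ground-truth noise flags, so it is only usable on synthetic or audited data.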
Taxonomy
Topics: Machine Learning and Data Classification · Image and Signal Denoising Methods · Neural Networks and Applications
