Improving Generalization by Controlling Label-Noise Information in Neural Network Weights
Hrayr Harutyunyan, Kyle Reing, Greg Ver Steeg, Aram Galstyan

TL;DR
This paper introduces a novel approach to improve neural network generalization in noisy label scenarios by controlling label-noise information in weights, using an auxiliary network to reduce memorization of noise.
Contribution
It proposes training algorithms that minimize label-noise information in weights via an auxiliary network predicting gradients without labels, enhancing robustness to noisy data.
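A minimal sketch of the idea of predicting a final-layer gradient without labels, under stated assumptions. For softmax cross-entropy, the gradient with respect to the logits is `softmax(logits) - onehot(label)`. The summary does not say how the predicted gradient is formed, so this sketch assumes the one-hot label is simply replaced by a label distribution produced by an auxiliary network. The function names and the `aux_probs` values are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def true_logit_grad(logits, labels, num_classes):
    # Exact cross-entropy gradient w.r.t. logits; this one uses the (possibly noisy) labels.
    onehot = np.eye(num_classes)[labels]
    return softmax(logits) - onehot

def predicted_logit_grad(logits, aux_probs):
    # Assumed label-free surrogate: the auxiliary network's predicted label
    # distribution stands in for the one-hot label.
    return softmax(logits) - aux_probs

logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
noisy_labels = np.array([1, 2])
# Hypothetical output of an auxiliary network that sees inputs but not labels.
aux_probs = np.array([[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]])

g_true = true_logit_grad(logits, noisy_labels, num_classes=3)
g_pred = predicted_logit_grad(logits, aux_probs)
```

In this sketch only `g_pred` would be backpropagated into the classifier, so the labels influence the classifier's weights solely through whatever the auxiliary network has learned from them.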
Findings
Reduced memorization of label noise in experiments
Improved generalization on noisy MNIST and CIFAR datasets
Effective on large-scale Clothing1M dataset with noisy labels
Abstract
In the presence of noisy or incorrect labels, neural networks have the undesirable tendency to memorize information about the noise. Standard regularization techniques such as dropout, weight decay or data augmentation sometimes help, but do not prevent this behavior. If one considers neural network weights as random variables that depend on the data and stochasticity of training, the amount of memorized information can be quantified with the Shannon mutual information between weights and the vector of all training labels given inputs, I(w : y | x). We show that for any training algorithm, low values of this term correspond to reduction in memorization of label-noise and better generalization bounds. To obtain these low values, we propose training algorithms that employ an auxiliary network that predicts gradients in the final layers of a classifier without…
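For reference, the Shannon mutual information between weights and training labels given inputs has the standard definition below. The symbols are notation assumed here: w for the weights, y for the vector of all training labels, and x for the inputs.

```latex
I(w : \mathbf{y} \mid \mathbf{x})
  = H(\mathbf{y} \mid \mathbf{x}) - H(\mathbf{y} \mid w, \mathbf{x})
  = \mathbb{E}_{p(w, \mathbf{y}, \mathbf{x})}\left[
      \log \frac{p(w, \mathbf{y} \mid \mathbf{x})}
                {p(w \mid \mathbf{x})\, p(\mathbf{y} \mid \mathbf{x})}
    \right]
```

The quantity is zero exactly when the weights carry no information about the labels beyond what the inputs already determine, which is why low values indicate less memorization of label noise.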
Taxonomy
Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Neural Networks and Applications
Methods: Weight Decay
