Impact of Label Noise on Learning Complex Features
Rahul Vashisht, P. Krishna Kumar, Harsha Vardhan Govind, and Harish G. Ramaswamy

TL;DR
Pre-training neural networks with noisy labels can enhance their ability to learn complex, diverse features and overcome the bias towards simpler decision boundaries, without sacrificing performance.
Contribution
This work demonstrates that noisy label pre-training promotes learning of complex features and diverse representations in neural networks, addressing limitations of traditional regularization methods.
Findings
Pre-training with noisy labels encourages learning complex functions.
Models pre-trained with noisy labels capture a broader set of features.
Performance remains unaffected despite learning more complex features.
Abstract
Neural networks trained with stochastic gradient descent exhibit an inductive bias towards simpler decision boundaries, typically converging to a narrow family of functions, and often fail to capture more complex features. This phenomenon raises concerns about the capacity of deep models to adequately learn and represent real-world datasets. Traditional approaches such as explicit regularization, data augmentation, and architectural modifications have largely proven ineffective in encouraging the models to learn diverse features. In this work, we investigate the impact of pre-training models with noisy labels on the dynamics of SGD across various architectures and datasets. We show that pre-training promotes learning complex functions and diverse features in the presence of noise. Our experiments demonstrate that pre-training with noisy labels encourages gradient descent to find…
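As a rough illustration of what "noisy labels" can mean in practice, the sketch below corrupts a fraction of class labels uniformly at random (often called symmetric label noise). The noise model, the function name add_symmetric_label_noise, and the noise_rate parameter are assumptions made for illustration only; this summary does not state how the authors inject noise.

```python
import random


def add_symmetric_label_noise(labels, num_classes, noise_rate, seed=0):
    """Return a copy of `labels` in which each entry is, with probability
    `noise_rate`, replaced by a different class chosen uniformly at random.

    Hypothetical helper: an assumed noise model, not the paper's procedure.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    for i, y in enumerate(noisy):
        if rng.random() < noise_rate:
            # Draw from the other num_classes - 1 classes, skipping y itself.
            other = rng.randrange(num_classes - 1)
            noisy[i] = other if other < y else other + 1
    return noisy


# Toy example: 1000 samples over 10 classes, with an assumed 20% noise rate.
clean = [i % 10 for i in range(1000)]
noisy = add_symmetric_label_noise(clean, num_classes=10, noise_rate=0.2)
flipped = sum(c != n for c, n in zip(clean, noisy))
print(f"{flipped} of {len(clean)} labels flipped")
```

In a setup like the one the abstract describes, a model would presumably first be trained on labels such as noisy; the training schedule and noise rates used in the paper are not given in this summary.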
Taxonomy
Methods: Sparse Evolutionary Training · Stochastic Gradient Descent
