Do deep nets really need weight decay and dropout?
Alex Hernández-García, Peter König

TL;DR
This paper investigates whether weight decay and dropout are essential regularization techniques in deep neural networks, finding that with sufficient data augmentation, these methods may be unnecessary for object recognition tasks.
Contribution
The study challenges the conventional necessity of weight decay and dropout, showing they can be omitted with adequate data augmentation without sacrificing performance.
Findings
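Weight decay and dropout may not be necessary for object recognition when enough data augmentation is used.
Explicit regularization reduces the effective capacity of a model, so it may be redundant, or even wasteful, in overparameterized networks.
In the ablation study, networks trained without weight decay and dropout did not lose performance as long as data augmentation was adequate.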
Abstract
The impressive success of modern deep neural networks on computer vision tasks has been achieved through models of very large capacity compared to the number of available training examples. This overparameterization is often said to be controlled with the help of different regularization techniques, mainly weight decay and dropout. However, since these techniques reduce the effective capacity of the model, typically even deeper and wider architectures are required to compensate for the reduced capacity. Therefore, there seems to be a waste of capacity in this practice. In this paper we build upon recent research that suggests that explicit regularization may not be as important as widely believed and carry out an ablation study that concludes that weight decay and dropout may not be necessary for object recognition if enough data augmentation is introduced.
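To make the terms in the abstract concrete, here is a minimal pure-Python sketch of the three ingredients being compared (weight decay, dropout, data augmentation) and of how an ablation grid over them might be enumerated. It is not the authors' code. The function names, the `5e-4` and `0.5` hyperparameter values, and the `"none"`/`"light"`/`"heavy"` augmentation labels are illustrative assumptions that do not come from the text above.

```python
import random
from itertools import product


def sgd_step(weights, grads, lr=0.1, weight_decay=0.0):
    """One SGD step. Weight decay adds the L2-penalty gradient weight_decay * w."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale survivors."""
    if rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def augment(image, rng):
    """Stand-in for data augmentation: random horizontal flip of a 2D pixel list."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


def ablation_grid():
    """Enumerate hypothetical settings: explicit regularization on/off per augmentation level."""
    grid = []
    for explicit_reg, aug in product([True, False], ["none", "light", "heavy"]):
        grid.append({
            "weight_decay": 5e-4 if explicit_reg else 0.0,  # assumed value
            "dropout": 0.5 if explicit_reg else 0.0,        # assumed value
            "augmentation": aug,                            # assumed labels
        })
    return grid


if __name__ == "__main__":
    for config in ablation_grid():
        print(config)
```

Under these assumptions, the question the abstract poses amounts to comparing the rows with `weight_decay` and `dropout` set to zero against the rows where they are enabled, at each augmentation level.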
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
Methods: Weight Decay
