Data augmentation instead of explicit regularization
Alex Hernández-García, Peter König

TL;DR
This paper demonstrates that data augmentation alone can match or surpass traditional regularization techniques like weight decay and dropout in deep learning, simplifying training and reducing hyperparameter tuning.
Contribution
It provides formal definitions of explicit and implicit regularization, and empirically compares data augmentation against weight decay and dropout on image classification tasks.
Findings
Data augmentation achieves comparable or better performance than weight decay and dropout.
Weight decay and dropout can harm performance if their hyperparameters are not carefully tuned.
Data augmentation provides significant generalization gains without additional hyperparameter tuning.
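To make the comparison concrete, the following is a minimal sketch of the kind of simple image augmentation often used in place of explicit regularizers: a random horizontal flip followed by a small random translation. It is a generic illustration, not the authors' implementation; the function names, the 0.5 flip probability, the 10% shift range, and the zero fill for uncovered pixels are all assumptions made for this example.

```python
import random


def random_flip_and_shift(image, max_shift_frac=0.1, rng=None):
    """Return a randomly flipped and translated copy of a 2-D image.

    `image` is a list of rows (lists of pixel values). The flip
    probability (0.5) and the default 10% shift range are illustrative
    assumptions, not values taken from the summary above.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])

    # Horizontal flip with probability 0.5; always copy so the input is untouched.
    if rng.random() < 0.5:
        rows = [row[::-1] for row in image]
    else:
        rows = [row[:] for row in image]

    # Draw integer shifts within +/- max_shift_frac of each dimension.
    max_dy = int(height * max_shift_frac)
    max_dx = int(width * max_shift_frac)
    dy = rng.randint(-max_dy, max_dy)
    dx = rng.randint(-max_dx, max_dx)

    # Translate, filling pixels that move in from outside the frame with zeros.
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = rows[src_y][src_x]
    return shifted


def augmented_batch(images, rng=None):
    """Apply a fresh random transformation to every image in a batch."""
    rng = rng or random.Random()
    return [random_flip_and_shift(img, rng=rng) for img in images]
```

In a training loop, `augmented_batch` would be called on each mini-batch before the forward pass, so the network sees a different variant of every image in every epoch. Unlike a weight decay coefficient or a dropout rate, the transformations leave the loss function and the network architecture unchanged.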
Abstract
Contrary to most machine learning models, modern deep artificial neural networks typically include multiple components that contribute to regularization. Despite the fact that some (explicit) regularization techniques, such as weight decay and dropout, require costly fine-tuning of sensitive hyperparameters, the interplay between them and other elements that provide implicit regularization is not well understood yet. Shedding light upon these interactions is key to efficiently using computational resources and may contribute to solving the puzzle of generalization in deep learning. Here, we first provide formal definitions of explicit and implicit regularization that help understand essential differences between techniques. Second, we contrast data augmentation with weight decay and dropout. Our results show that visual object categorization models trained with data augmentation alone…
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection
Methods: Weight Decay
