Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization
Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han

TL;DR
This paper provides a theoretical interpretation of noise-based regularization in deep neural networks, showing it optimizes a lower bound of the true objective, and proposes a method to tighten this bound for improved generalization.
Contribution
The paper interprets training with noise injection as optimizing a lower bound of the true objective, and proposes using multiple noise samples per training example to tighten that bound and make the regularization more effective.
Findings
Noise injection regularization optimizes a lower bound of the true objective.
Using multiple noise samples tightens the lower bound and improves generalization.
The proposed method is reported to be effective on computer vision tasks.
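The first two findings can be read through a standard Jensen's-inequality argument. The block below is a sketch under assumptions, not the paper's exact formulation: it assumes the true objective is the log-likelihood marginalized over the noise, and the symbols x (input), y (target), \epsilon (injected noise), and K (number of noise samples) are notation chosen here for illustration.

```latex
% Assumed notation: x = input, y = target, \epsilon = injected noise (e.g., a dropout mask),
% K = number of independent noise samples drawn per training example.
\begin{align}
\log \mathbb{E}_{\epsilon}\big[\, p(y \mid x, \epsilon) \,\big]
  \;&\ge\;
  \mathbb{E}_{\epsilon_1, \dots, \epsilon_K}\!\left[ \log \frac{1}{K} \sum_{k=1}^{K} p(y \mid x, \epsilon_k) \right]
  \\
  \;&\ge\;
  \mathbb{E}_{\epsilon}\big[\, \log p(y \mid x, \epsilon) \,\big].
\end{align}
% The last line is the K = 1 case, i.e. ordinary training with one noise sample per example.
% Both inequalities follow from the concavity of \log (Jensen's inequality).
```

Under this reading, conventional single-sample training maximizes the rightmost term, and averaging the likelihood over K noise samples before taking the logarithm gives a bound that is never looser.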
Abstract
Overfitting is one of the most critical challenges in deep neural networks, and there are various types of regularization methods to improve generalization performance. Injecting noise into hidden units during training, e.g., dropout, is known as a successful regularizer, but it is still not clear why such training techniques work well in practice and how we can maximize their benefit in the presence of two conflicting objectives: fitting the true data distribution and preventing overfitting by regularization. This paper addresses the above issues by 1) showing that conventional training methods with regularization by noise injection optimize a lower bound of the true objective and 2) proposing a technique to achieve a tighter lower bound using multiple noise samples per training example in a stochastic gradient descent iteration. We demonstrate the effectiveness of…
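To make the idea of "multiple noise samples per training example" concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the two-layer binary classifier, the dropout noise, and every name in it (`noisy_log_likelihoods`, `single_sample_bound`, `multi_sample_bound`, `keep_prob`) are hypothetical choices made for illustration. It only compares the two objective values; the gradient computation used in the paper is not shown.

```python
import numpy as np


def log_mean_exp(log_p, axis=0):
    """Numerically stable log of the mean of exp(log_p) along `axis`."""
    m = np.max(log_p, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(log_p - m), axis=axis))


def noisy_log_likelihoods(x, y, W1, w2, num_samples, keep_prob, rng):
    """Return log p(y | x, eps_k) for K dropout masks, shape (K, N).

    Hypothetical model: ReLU hidden layer with dropout, sigmoid output,
    binary targets y in {0, 1}.
    """
    hidden = np.maximum(x @ W1, 0.0)                # (N, H) noise-free activations
    sign = 2.0 * y - 1.0                            # map {0, 1} -> {-1, +1}
    out = []
    for _ in range(num_samples):
        mask = rng.random(hidden.shape) < keep_prob  # one dropout mask per sample
        logits = (hidden * mask / keep_prob) @ w2    # (N,)
        out.append(-np.logaddexp(0.0, -sign * logits))  # log sigmoid(sign * logits)
    return np.stack(out)


def single_sample_bound(log_p):
    """Average of log-likelihoods over noise samples (conventional objective)."""
    return np.mean(log_p, axis=0)


def multi_sample_bound(log_p):
    """Log of the likelihood averaged over noise samples (tighter bound)."""
    return log_mean_exp(log_p, axis=0)


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))                          # toy inputs (assumed data)
y = rng.integers(0, 2, size=8).astype(float)         # toy binary targets
W1 = rng.normal(size=(5, 16))
w2 = rng.normal(size=16)

log_p = noisy_log_likelihoods(x, y, W1, w2, num_samples=4, keep_prob=0.5, rng=rng)
loss_single = -single_sample_bound(log_p).mean()
loss_multi = -multi_sample_bound(log_p).mean()
```

For any set of noise samples, `multi_sample_bound` is at least as large as `single_sample_bound` for each example (Jensen's inequality), so `loss_multi` never exceeds `loss_single`.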
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques
