Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization

Hyeonwoo Noh; Tackgeun You; Jonghwan Mun; Bohyung Han

arXiv:1710.05179 · cs.LG · November 10, 2017 · 78 cites

Open Access

TL;DR

This paper provides a theoretical interpretation of noise-based regularization in deep neural networks, showing it optimizes a lower bound of the true objective, and proposes a method to tighten this bound for improved generalization.

Contribution

It introduces a novel interpretation of noise injection as optimizing a lower bound of the true objective and proposes a technique that uses multiple noise samples per training example to tighten that bound.

Findings

01

Noise injection regularization optimizes a lower bound of the true objective.

02

Using multiple noise samples tightens the lower bound and improves generalization.

03

The proposed method is effective on computer vision tasks.
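
The first two findings can be written as one chain of inequalities. This is a sketch in assumed notation, not taken from this page: ε is the injected noise, p(y | x, ε) is the network's likelihood for label y under one noise sample, K is the number of noise samples per training example, and the "true objective" is assumed to be the marginal log-likelihood.

```latex
% Assumed notation: \epsilon = injected noise, K = noise samples per example.
% Left:   conventional noise-injection objective (average of log-likelihoods).
% Middle: K-sample objective (log of the averaged likelihood).
% Right:  assumed true objective (marginal log-likelihood).
\mathbb{E}_{\epsilon}\!\left[\log p(y \mid x, \epsilon)\right]
\;\le\;
\mathbb{E}_{\epsilon_1,\dots,\epsilon_K}\!\left[\log \frac{1}{K}\sum_{k=1}^{K} p(y \mid x, \epsilon_k)\right]
\;\le\;
\log \mathbb{E}_{\epsilon}\!\left[p(y \mid x, \epsilon)\right]
```

Both inequalities follow from Jensen's inequality applied to the concave logarithm; with K = 1 the middle term reduces to the left one.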

Abstract

Overfitting is one of the most critical challenges in deep neural networks, and various regularization methods exist to improve generalization performance. Injecting noise into hidden units during training, e.g., dropout, is known to be a successful regularizer, but it is still unclear why such training techniques work well in practice and how to maximize their benefit in the presence of two conflicting objectives: fitting the true data distribution and preventing overfitting through regularization. This paper addresses these issues by 1) interpreting conventional training with regularization by noise injection as optimizing a lower bound of the true objective and 2) proposing a technique that achieves a tighter lower bound by using multiple noise samples per training example in each stochastic gradient descent iteration. We demonstrate the effectiveness of…
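
The difference between the two objectives described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the toy logistic model, the input-level dropout mask, and the names `noisy_log_likelihood`, `objectives`, and `log_mean_exp` are all assumptions made for illustration. It compares the average of per-sample log-likelihoods (the conventional noise-injection objective) with the log of the averaged likelihood over several noise samples for a single training example.

```python
import numpy as np

def log_mean_exp(a, axis=0):
    """Numerically stable log(mean(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(a - m), axis=axis))

def noisy_log_likelihood(x, y, w, rng, keep_prob=0.5):
    """log p(y | x, eps) for a toy logistic model (hypothetical).

    eps is one dropout mask applied to the inputs, with inverted scaling.
    """
    mask = rng.random(x.shape) < keep_prob
    h = x * mask / keep_prob
    logit = h @ w
    # log sigmoid(logit) for y == 1, log(1 - sigmoid(logit)) for y == 0
    return -np.logaddexp(0.0, -logit) if y == 1 else -np.logaddexp(0.0, logit)

def objectives(x, y, w, num_samples, seed=0):
    """Return (average of log-likelihoods, log of average likelihood)."""
    rng = np.random.default_rng(seed)
    ll = np.array([noisy_log_likelihood(x, y, w, rng) for _ in range(num_samples)])
    conventional = ll.mean()         # one term per noise sample, then averaged
    multi_sample = log_mean_exp(ll)  # likelihoods averaged inside the log
    return conventional, multi_sample

# Illustrative values only; these are not data from the paper.
x = np.array([1.0, -2.0, 0.5, 3.0])
w = np.array([0.8, 0.3, -1.2, 0.5])
conventional, multi_sample = objectives(x, y=1, w=w, num_samples=8)
print(conventional, multi_sample)
```

By Jensen's inequality the second value is never smaller than the first for the same set of noise samples, which is the sense in which averaging inside the logarithm gives a tighter bound. In training, the negated multi-sample value would serve as the per-example loss.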

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques