The Implicit and Explicit Regularization Effects of Dropout

Colin Wei; Sham Kakade; Tengyu Ma

arXiv:2002.12915 · cs.LG · October 16, 2020 · 27 citations

TL;DR

This paper analyzes how dropout regularizes neural networks through explicit modifications to the training objective and implicit stochastic effects, providing analytic characterizations that replicate dropout's benefits.

Contribution

It disentangles the explicit and implicit regularization effects of dropout and derives analytic simplifications that accurately model these effects.

Findings

01 The explicit and implicit effects of dropout are distinct and can be separated through controlled experiments.

02 Analytic regularizers derived from these effects can replace dropout effectively.

03 The implicit effect is analogous to the effect of stochasticity in small mini-batch SGD.

Abstract

Dropout is a widely used regularization technique, often required to obtain state-of-the-art results for a number of architectures. This work demonstrates that dropout introduces two distinct but entangled regularization effects: an explicit effect (also studied in prior work), which occurs because dropout modifies the expected training objective, and, perhaps surprisingly, an additional implicit effect from the stochasticity in the dropout training update. This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent. We disentangle these two effects through controlled experiments. We then derive analytic simplifications which characterize each effect in terms of the derivatives of the model and the loss, for deep neural networks. We demonstrate these simplified, analytic regularizers accurately capture the important aspects of…
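To make the explicit/implicit distinction concrete, here is a minimal sketch on a toy case that is not taken from the paper: a linear model with squared loss and inverted dropout applied to the inputs. In that setting the expected dropout loss equals the clean loss plus a closed-form penalty, p/(1-p) · Σ (w_i x_i)², which plays the role of the explicit regularizer. The spread of the loss across individual masks is the stochasticity that the abstract associates with the implicit effect. The function names, the linear model, and all parameter values are assumptions chosen for illustration; the paper's own regularizers are derived for deep networks and are not reproduced here.

```python
import numpy as np


def explicit_regularizer(w, x, p):
    """Closed-form penalty for a linear model with inverted input dropout.

    Each input is scaled by m_i / (1 - p) with m_i ~ Bernoulli(1 - p), so the
    scaling factor has mean 1 and variance p / (1 - p).
    """
    return p / (1.0 - p) * np.sum((w * x) ** 2)


def demo(p=0.5, dim=5, num_masks=200_000, seed=0):
    """Compare the Monte Carlo mean of the dropout loss with the closed form."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=dim)   # hypothetical weights
    x = rng.normal(size=dim)   # hypothetical single input
    y = rng.normal()           # hypothetical target

    # Keep each coordinate with probability 1 - p, then rescale (inverted dropout).
    masks = rng.random((num_masks, dim)) >= p
    preds = (masks * x / (1.0 - p)) @ w
    losses = (y - preds) ** 2

    clean_loss = (y - w @ x) ** 2
    return {
        # Average over masks: approximates the expected training objective.
        "monte_carlo_mean": losses.mean(),
        # Clean loss plus the explicit penalty: the same quantity in closed form.
        "closed_form": clean_loss + explicit_regularizer(w, x, p),
        # Per-mask spread: the noise a single dropout update actually sees.
        "per_mask_std": losses.std(),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.4f}")
```

Training on the closed-form objective would keep the explicit penalty while removing the per-mask noise, whereas ordinary dropout training samples one mask per update and so carries both. This toy case only illustrates the decomposition; it makes no claim about how large either effect is in practice.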

Code & Models

Repositories

cwein3/dropout-analytical (PyTorch, official)

Videos

The Implicit and Explicit Regularization Effects of Dropout · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning

Methods: Dropout