On the Implicit Bias of Dropout
Poorya Mianjy, Raman Arora, Rene Vidal

TL;DR
This paper investigates the implicit bias that dropout introduces in deep learning. For single hidden-layer linear networks, it shows that dropout tends to equalize the norms of the incoming/outgoing weight vectors across hidden nodes, and it characterizes the optimization landscape that dropout induces.
Contribution
It provides a theoretical analysis of dropout's implicit bias in single hidden-layer linear neural networks: a norm-equalization result for the hidden nodes and a complete characterization of the dropout-induced optimization landscape.
Findings
Dropout tends to make the norms of the incoming/outgoing weight vectors equal across all hidden nodes.
The paper gives a complete characterization of the optimization landscape induced by dropout.
These results offer insight into how dropout may help over-parametrized models generalize.
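To make the idea of an implicit bias concrete, here is a minimal sketch, not the paper's own code. It assumes a dropout penalty of the form sum_i ||u_i||^2 ||v_i||^2, which is what the expected dropout loss adds to the plain loss when the inputs have identity covariance. The matrix names `U` (outgoing weights) and `V` (incoming weights), the sizes, and the rotation scan are illustrative choices. Rotating the hidden units leaves the network map `U @ V.T` unchanged, yet the penalty changes, so the penalty prefers some factorizations of the same map over others. The lower bound used at the end follows from the Cauchy-Schwarz inequality and the triangle inequality for the nuclear norm.

```python
import numpy as np

def product_penalty(U, V):
    """Sum over hidden nodes of ||u_i||^2 * ||v_i||^2 (assumed penalty form)."""
    return float(np.sum(np.sum(U ** 2, axis=0) * np.sum(V ** 2, axis=0)))

def rotation(angle):
    """2x2 rotation matrix acting on a network with two hidden nodes."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

rng = np.random.default_rng(0)
U = rng.normal(size=(3, 2))   # outgoing weights, one column per hidden node
V = rng.normal(size=(4, 2))   # incoming weights, one column per hidden node
M = U @ V.T                   # the linear map the network computes

angles = np.linspace(0.0, np.pi, 181)
penalties = []
for a in angles:
    Q = rotation(a)
    Uq, Vq = U @ Q, V @ Q
    # Q is orthogonal, so the rotated factors compute the same map.
    assert np.allclose(Uq @ Vq.T, M)
    penalties.append(product_penalty(Uq, Vq))
penalties = np.array(penalties)

# Any factorization of M with r hidden nodes satisfies
# penalty >= (nuclear norm of M)^2 / r.
r = U.shape[1]
lower_bound = np.linalg.norm(M, ord="nuc") ** 2 / r

print("penalty range over rotations:", penalties.min(), penalties.max())
print("lower bound from nuclear norm:", lower_bound)
```

The scan covers only rotations of one random factorization, so its minimum need not reach the lower bound. It is meant only to show that the penalty separates factorizations the plain loss cannot tell apart.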
Abstract
Algorithmic approaches endow deep learning systems with implicit bias that helps them generalize even in over-parametrized settings. In this paper, we focus on understanding such a bias induced in learning through dropout, a popular technique to avoid overfitting in deep learning. For single hidden-layer linear neural networks, we show that dropout tends to make the norm of incoming/outgoing weight vectors of all the hidden nodes equal. In addition, we provide a complete characterization of the optimization landscape induced by dropout.
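As a rough illustration of the setting in the abstract, the sketch below checks numerically that the expected dropout loss of a single hidden-layer linear network equals the plain squared loss plus a data-dependent penalty on the weights. It makes several assumptions not stated on this page: each hidden node is kept independently with probability `theta`, kept activations are rescaled by `1/theta`, and the loss is the average squared error. The function names and problem sizes are hypothetical. The expectation is computed exactly by enumerating all 2^r dropout masks, so no sampling error is involved.

```python
import itertools
import numpy as np

def dropout_objective_exact(U, V, X, Y, theta):
    """Average squared error, with the expectation over dropout masks computed exactly."""
    r = U.shape[1]
    total = 0.0
    for bits in itertools.product([0.0, 1.0], repeat=r):
        b = np.array(bits)
        kept = int(b.sum())
        prob = theta ** kept * (1.0 - theta) ** (r - kept)
        # Mask the hidden activations, rescale by 1/theta, map to outputs.
        pred = (X @ V * b) @ U.T / theta
        total += prob * np.mean(np.sum((Y - pred) ** 2, axis=1))
    return total

def plain_loss_plus_penalty(U, V, X, Y, theta):
    """Plain squared loss plus the penalty that dropout adds in expectation."""
    lam = (1.0 - theta) / theta
    plain = np.mean(np.sum((Y - X @ V @ U.T) ** 2, axis=1))
    hidden_sq = np.mean((X @ V) ** 2, axis=0)  # mean of (v_i^T x)^2 per hidden node
    col_sq = np.sum(U ** 2, axis=0)            # ||u_i||^2 per hidden node
    return plain + lam * np.sum(col_sq * hidden_sq)

rng = np.random.default_rng(0)
n, d_in, d_out, r, theta = 50, 4, 3, 3, 0.7
X = rng.normal(size=(n, d_in))
Y = rng.normal(size=(n, d_out))
V = rng.normal(size=(d_in, r))   # incoming weights, one column per hidden node
U = rng.normal(size=(d_out, r))  # outgoing weights, one column per hidden node

lhs = dropout_objective_exact(U, V, X, Y, theta)
rhs = plain_loss_plus_penalty(U, V, X, Y, theta)
print(lhs, rhs)  # the two values agree up to floating-point error
```

When the inputs have identity covariance, the per-node factor `hidden_sq` reduces to ||v_i||^2, so the penalty couples the incoming and outgoing weight norms of each hidden node. That coupling is what the norm-equalization result concerns.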
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Neural Networks and Applications
Methods: Dropout
