Generalized Dropout
Suraj Srinivas, R. Venkatesh Babu

TL;DR
This paper introduces Generalized Dropout, a family of regularizers that extends classical Dropout. The family includes Dropout++, a variant with trainable parameters, and a method that selects layer widths; the authors report better generalization than with Dropout.
Contribution
The paper proposes Generalized Dropout, a family of regularizers that includes Dropout with trainable parameters (Dropout++) and a method for selecting layer widths. Classical Dropout arises as a special case of Dropout++.
Findings
Generalized Dropout methods generalize better than standard Dropout in the paper's experiments.
Dropout++, which makes the Dropout parameters trainable, outperforms classical Dropout.
Selecting layer widths with Generalized Dropout also improves model performance.
Abstract
Deep Neural Networks often require good regularizers to generalize well. Dropout is one such regularizer that is widely used among Deep Learning practitioners. Recent work has shown that Dropout can also be viewed as performing Approximate Bayesian Inference over the network parameters. In this work, we generalize this notion and introduce a rich family of regularizers which we call Generalized Dropout. One set of methods in this family, called Dropout++, is a version of Dropout with trainable parameters. Classical Dropout emerges as a special case of this method. Another member of this family selects the width of neural network layers. Experiments show that these methods help in improving generalization performance over Dropout.
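The relationship between classical Dropout and a variant with adjustable parameters can be illustrated with a minimal sketch. It shows only the forward gating step: each activation is multiplied by an independent Bernoulli gate. The function name `bernoulli_gate`, the argument `keep_prob`, and the rescaling by `1 / keep_prob` are assumptions made for this illustration; the abstract does not specify how the parameters are trained, and this is not the authors' implementation.

```python
import random


def bernoulli_gate(activations, keep_prob, rng, rescale=True):
    """Multiply each activation by an independent Bernoulli(keep_prob) gate.

    keep_prob is a hypothetical name for the retention probability.
    With rescale=True the kept activations are divided by keep_prob
    (the common "inverted dropout" convention), so the expected output
    equals the input.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    scale = 1.0 / keep_prob if rescale else 1.0
    out = []
    for a in activations:
        gate = 1.0 if rng.random() < keep_prob else 0.0
        out.append(a * gate * scale)
    return out


rng = random.Random(0)
x = [1.0] * 10000

# Classical Dropout: the retention probability is a fixed hyperparameter.
classical = bernoulli_gate(x, keep_prob=0.5, rng=rng)

# Same gate with a different retention probability, standing in for a
# value that a trainable-parameter variant might arrive at (assumption).
per_layer = bernoulli_gate(x, keep_prob=0.8, rng=rng)
```

In this sketch, fixing `keep_prob` in advance gives the classical behaviour, while treating it as a per-layer quantity to be learned is one way to read "Dropout with trainable parameters".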
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Dropout
