Dropout as data augmentation
Xavier Bouthillier, Kishore Konda, Pascal Vincent, Roland Memisevic

TL;DR
This paper reinterprets dropout as a form of data augmentation in the input space, introduces a method for projecting dropout noise back into the input space, and proposes a new dropout noise scheme that improves dropout results without adding significant computational cost.
Contribution
It offers a new perspective on dropout as input-space data augmentation that requires no domain knowledge, a method for generating augmented training samples from dropout noise, and a new dropout noise scheme that improves on standard dropout results.
Findings
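Dropout noise projected back into the input space generates augmented versions of the training data.
Training a deterministic network on the augmented samples yields results similar to training with dropout.
The new dropout noise scheme improves dropout results without adding significant computational cost.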
Abstract
Dropout is typically interpreted as bagging a large number of models sharing parameters. We show that using dropout in a network can also be interpreted as a kind of data augmentation in the input space without domain knowledge. We present an approach to projecting the dropout noise within a network back into the input space, thereby generating augmented versions of the training data, and we show that training a deterministic network on the augmented samples yields similar results. Finally, we propose a new dropout noise scheme based on our observations and show that it improves dropout results without adding significant computational cost.
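The projection idea in the abstract can be illustrated with a minimal sketch. It assumes a single ReLU hidden layer and plain gradient descent on the input; the function name `project_dropout_noise`, the layer sizes, the learning rate, and the step count are all hypothetical choices for illustration, not the authors' implementation. Given an input x and a dropout mask, it searches for an input x* whose noise-free hidden activation approximates the masked activation of x.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def loss_fn(x_star, W, b, target):
    # Squared error between the clean activation of x_star and the target.
    return 0.5 * np.sum((relu(W @ x_star + b) - target) ** 2)

def project_dropout_noise(x, W, b, mask, lr=0.1, steps=500):
    """Search for x_star such that relu(W x_star + b) ~ mask * relu(W x + b).

    Assumption: one ReLU layer and plain gradient descent on the input.
    Returns the lowest-loss iterate seen and the masked target activation.
    """
    target = mask * relu(W @ x + b)      # dropout-masked hidden activation
    x_star = x.copy()
    best, best_loss = x_star.copy(), loss_fn(x_star, W, b, target)
    for _ in range(steps):
        pre = W @ x_star + b
        resid = relu(pre) - target
        # Gradient of the squared error with respect to the input.
        grad = W.T @ (resid * (pre > 0))
        x_star = x_star - lr * grad
        current = loss_fn(x_star, W, b, target)
        if current < best_loss:
            best, best_loss = x_star.copy(), current
    return best, target

rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(8, 4))   # hypothetical 4 -> 8 layer
b = np.zeros(8)
x = rng.normal(size=4)
mask = (rng.random(8) > 0.5).astype(float)  # keep each unit with prob. 0.5

x_star, target = project_dropout_noise(x, W, b, mask)
loss_before = loss_fn(x, W, b, target)
loss_after = loss_fn(x_star, W, b, target)
print("loss before:", loss_before, "loss after:", loss_after)
```

Under these assumptions, x* plays the role of an augmented training sample: feeding it through the deterministic layer gives roughly what x would produce under that particular dropout mask. An exact match is not guaranteed, since the masked activation may not be reachable by any input.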
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning
Methods: Dropout
