Effective and Efficient Dropout for Deep Convolutional Neural Networks

Shaofeng Cai; Yao Shu; Gang Chen; Beng Chin Ooi; Wei Wang; Meihui Zhang

arXiv:1904.03392·cs.LG·July 29, 2020·55 cites

Open Access

TL;DR

This paper investigates the limitations of standard dropout in CNNs, proposes new dropout variants and placement strategies to improve regularization, and demonstrates significant performance gains on benchmark datasets.

Contribution

It introduces novel dropout variants and placement methods that better integrate with CNNs, addressing conflicts with Batch Normalization and enhancing regularization effectiveness.

Findings

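1. Placing dropout right before the convolutional operation, rather than before Batch Normalization (BN), improves regularization.

2. Replacing BN with Group Normalization (GN) removes the conflict between dropout and the normalization layer.

3. The proposed Drop-Conv2d variant improves CNN performance.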

Abstract

Applications based on convolutional neural networks (CNNs) have become ubiquitous, and proper regularization is greatly needed. To prevent large neural network models from overfitting, dropout has been widely used as an efficient regularization technique in practice. However, many recent works show that standard dropout is ineffective or even detrimental to the training of CNNs. In this paper, we revisit this issue and examine various dropout variants in an attempt to improve existing dropout-based regularization techniques for CNNs. We attribute the failure of standard dropout to the conflict between the stochasticity of dropout and the Batch Normalization (BN) that follows it, and propose to reduce the conflict by placing dropout operations right before the convolutional operation instead of before BN, or to address the issue entirely by replacing BN with Group Normalization (GN). We further…
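One way to see the conflict the abstract describes is to compare the activation statistics that a layer placed after dropout would observe in training and at inference. The sketch below is not the authors' experiment. It assumes hypothetical zero-mean, unit-variance Gaussian activations and the common "inverted" dropout formulation, and the function names are illustrative only. A normalization layer that accumulates running statistics in training, as BN does, would record the inflated training variance and then reuse it at inference, where dropout is switched off.

```python
import random
import statistics


def inverted_dropout(values, keep_prob, rng):
    """Zero each value with probability 1 - keep_prob and rescale survivors by 1 / keep_prob."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


def train_eval_variance(keep_prob=0.5, n=50_000, seed=0):
    """Return the activation variance after dropout (training) and without it (inference)."""
    rng = random.Random(seed)
    # Assumption: pre-dropout activations are standard Gaussian samples.
    activations = [rng.gauss(0.0, 1.0) for _ in range(n)]
    train_var = statistics.pvariance(inverted_dropout(activations, keep_prob, rng))
    # At inference dropout is the identity, so the next layer sees the raw activations.
    eval_var = statistics.pvariance(activations)
    return train_var, eval_var


train_var, eval_var = train_eval_variance()
print(f"variance seen after dropout in training:  {train_var:.3f}")
print(f"variance seen after dropout at inference: {eval_var:.3f}")
```

Under these assumptions the training-time variance is close to 1 / keep_prob times the inference-time variance, so the two modes disagree about the statistics of the same feature. GN computes its statistics per sample in both modes and keeps no running averages, which is consistent with the abstract's suggestion that swapping BN for GN addresses the issue.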

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and ELM · Adversarial Robustness in Machine Learning

Methods: Dropout · Batch Normalization