Drop-Activation: Implicit Parameter Reduction and Harmonic Regularization

Senwei Liang; Yuehaw Khoo; Haizhao Yang

arXiv:1811.05850·cs.LG·March 31, 2020·5 cites

TL;DR

Drop-Activation is a novel regularization technique that randomly drops nonlinear activations during training, acting as implicit parameter reduction and improving generalization in deep neural networks.

Contribution

The paper introduces Drop-Activation, a regularization method that improves generalization by randomly replacing nonlinear activations with the identity during training.

Findings

01 Generally improves the performance of popular architectures on image classification (CIFAR-10, CIFAR-100, SVHN, EMNIST, and ImageNet).

02 Compatible with Batch Normalization and AutoAugment.

03 Acts as implicit parameter reduction.

Abstract

Overfitting frequently occurs in deep learning. In this paper, we propose a novel regularization method called Drop-Activation to reduce overfitting and improve generalization. The key idea is to drop nonlinear activation functions by setting them to be identity functions randomly during training time. During testing, we use a deterministic network with a new activation function to encode the average effect of dropping activations randomly. Our theoretical analyses support the regularization effect of Drop-Activation as implicit parameter reduction and verify its capability to be used together with Batch Normalization (Ioffe and Szegedy 2015). The experimental results on CIFAR-10, CIFAR-100, SVHN, EMNIST, and ImageNet show that Drop-Activation generally improves the performance of popular neural network architectures for the image classification task. Furthermore, as a regularizer…
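
The train/test behavior described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the nonlinearity is ReLU and that each unit keeps its nonlinearity independently with probability `keep_prob`; the names `drop_activation_train`, `drop_activation_eval`, and `keep_prob` are hypothetical and chosen only for this illustration. Under those assumptions, averaging over the random drops gives `keep_prob * relu(x) + (1 - keep_prob) * x`, which has the shape of a leaky ReLU with negative slope `1 - keep_prob`.

```python
import random


def relu(x):
    """Standard ReLU on a single float."""
    return x if x > 0.0 else 0.0


def drop_activation_train(xs, keep_prob, rng):
    """Training-time pass (assumed form).

    Each element keeps its ReLU with probability keep_prob;
    otherwise the nonlinearity is dropped and the identity is used.
    """
    return [relu(x) if rng.random() < keep_prob else x for x in xs]


def drop_activation_eval(xs, keep_prob):
    """Test-time pass (assumed form).

    Deterministic activation equal to the expectation of the
    training-time activation over the random drops.
    """
    return [keep_prob * relu(x) + (1.0 - keep_prob) * x for x in xs]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    pre_activations = [-2.0, -0.5, 0.0, 1.5]
    print("train:", drop_activation_train(pre_activations, 0.75, rng))
    print("eval: ", drop_activation_eval(pre_activations, 0.75))
```

Setting `keep_prob` to 1.0 recovers an ordinary ReLU layer in both modes, and setting it to 0.0 makes the layer purely linear, so the parameter interpolates between the two.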

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Batch Normalization