Regularization in ResNet with Stochastic Depth
Soufiane Hayou, Fadhel Ayed

TL;DR
This paper analyzes the regularization effects of Stochastic Depth in ResNets using a hybrid theoretical approach, providing guidelines for selecting survival rates to improve model generalization.
Contribution
It offers a hybrid analysis that combines perturbation analysis with signal propagation to explain the regularization effects of Stochastic Depth (SD) and to derive principled guidelines for choosing survival rates.
Findings
Provides theoretical insights into SD's regularization mechanisms
Derives guidelines for choosing survival rates in SD training
Enhances understanding of SD's impact on ResNet generalization
Abstract
Regularization plays a major role in modern deep learning. From classic techniques such as L1 and L2 penalties to noise-based methods such as Dropout, regularization often yields better generalization by avoiding overfitting. Recently, Stochastic Depth (SD) has emerged as an alternative regularization technique for residual neural networks (ResNets) and has been shown to boost the performance of ResNets on many tasks [Huang et al., 2016]. Despite the recent success of SD, little is known about this technique from a theoretical perspective. This paper provides a hybrid analysis combining perturbation analysis and signal propagation to shed light on different regularization effects of SD. Our analysis allows us to derive principled guidelines for choosing the survival rates used for training with SD.
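For readers unfamiliar with the mechanism, the following is a minimal sketch of how SD is commonly formulated, following Huang et al., 2016 rather than anything stated in this abstract. During training each residual branch is kept with its survival rate and skipped otherwise; at inference each branch is scaled by its survival rate. The function names (`linear_decay_survival_rates`, `sd_forward`), the scalar inputs, and the default `p_last=0.5` are illustrative assumptions, not the authors' code or their recommended settings.

```python
import random


def linear_decay_survival_rates(num_blocks, p_last=0.5):
    """Linearly decaying survival rates, as commonly used with SD.

    Block l (1-indexed) out of L gets p_l = 1 - (l / L) * (1 - p_last).
    p_last is an assumed, illustrative default.
    """
    return [1.0 - (l / num_blocks) * (1.0 - p_last)
            for l in range(1, num_blocks + 1)]


def sd_forward(x, blocks, survival_rates, training, rng=random):
    """Pass x through residual blocks with Stochastic Depth.

    Training:  x <- x + b * f(x), with b ~ Bernoulli(p)
    Inference: x <- x + p * f(x)
    Here x is a plain float and each f is a callable, for illustration only.
    """
    for f, p in zip(blocks, survival_rates):
        if training:
            # Keep the residual branch with probability p, else skip it.
            if rng.random() < p:
                x = x + f(x)
        else:
            # Scale the branch by its survival rate at inference time.
            x = x + p * f(x)
    return x


# Usage: four toy blocks that each add a constant residual of 1.0.
rates = linear_decay_survival_rates(4, p_last=0.5)
blocks = [lambda x: 1.0] * 4
inference_output = sd_forward(0.0, blocks, rates, training=False)
```

A uniform schedule (the same survival rate for every block) is the other common choice; which schedule to prefer is the kind of question the paper's guidelines address.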
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Batch Normalization · Convolution · Residual Connection · Average Pooling · Global Average Pooling · Kaiming Initialization · 1x1 Convolution · Dropout · Residual Block
