RandomOut: Using a convolutional gradient norm to rescue convolutional filters
Joseph Paul Cohen, Henry Z. Lo, Wei Ding

TL;DR
This paper introduces RandomOut, a method that uses the gradient norm to identify convolutional filters with little impact on the prediction and reinitializes them, making CNN training less sensitive to the random seed and improving consistency and accuracy.
Contribution
The paper proposes a novel filter reinitialization technique based on gradient norms to enhance CNN training stability and performance across different random initializations.
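The idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the norm is the L2 norm of the loss gradient with respect to each filter's weights, that a filter is replaced when this norm falls below a fixed threshold, and that replacement weights are drawn from a small Gaussian. The names `filter_grad_norms`, `reinit_low_impact_filters`, `threshold`, and `scale` are hypothetical.

```python
import numpy as np


def filter_grad_norms(grads):
    """L2 norm of the gradient for each output filter.

    grads has shape (out_channels, in_channels, kernel_h, kernel_w).
    """
    flat = grads.reshape(grads.shape[0], -1)
    return np.linalg.norm(flat, axis=1)


def reinit_low_impact_filters(weights, grads, threshold, rng, scale=0.01):
    """Return a copy of weights with low-gradient-norm filters redrawn.

    threshold, scale, and the Gaussian re-initialization are illustrative
    assumptions, not values taken from the paper.
    """
    norms = filter_grad_norms(grads)
    mask = norms < threshold  # True for filters treated as low impact
    new_weights = weights.copy()
    n_replaced = int(mask.sum())
    if n_replaced:
        new_shape = (n_replaced,) + weights.shape[1:]
        new_weights[mask] = rng.normal(0.0, scale, size=new_shape)
    return new_weights, mask


# Toy layer with 4 filters; the gradients are synthetic, for illustration only.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.01, size=(4, 3, 3, 3))
grads = rng.normal(0.0, 1.0, size=(4, 3, 3, 3))
grads[2] *= 1e-6  # pretend filter 2 barely affects the loss

new_weights, replaced = reinit_low_impact_filters(
    weights, grads, threshold=1e-3, rng=rng
)
print(replaced.tolist())
```

In this toy run only filter 2 is flagged and redrawn; the other filters keep their weights. In a real training loop the gradients would come from backpropagation, and how often to run the check is a further design choice this sketch leaves open.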
Findings
Median accuracy increase of +3.3% on Inception-V3 without Batch Normalization.
More consistent generalization performance with lower standard deviation across seeds.
Faster training with fewer parameters needed.
Abstract
Filters in convolutional neural networks are sensitive to their initialization. The random numbers used to initialize filters are a bias and determine whether you will "win" and converge to a satisfactory local minimum, so we call this The Filter Lottery. We observe that the 28x28 Inception-V3 model without Batch Normalization fails to train 26% of the time when varying the random seed alone. This is a problem that affects the trial-and-error process of designing a network. Because random seeds have a large impact, it is hard to evaluate a network design without trying many different random starting weights. This work aims to reduce the bias imposed by the initial weights so that a network converges more consistently. We propose to evaluate and replace specific convolutional filters that have little impact on the prediction. We use the gradient norm to evaluate the impact of a filter on…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
Methods: Average Pooling · Auxiliary Classifier · 1x1 Convolution · RMSProp · Inception-v3 Module · Max Pooling · Softmax · Convolution · Dropout · Dense Connections
