Fast Training of Convolutional Neural Networks via Kernel Rescaling
Pedro Porto Buarque de Gusmão, Gianluca Francini, Skjalg Lepsøy, Enrico Magli

TL;DR
This paper introduces a theoretically grounded method to accelerate CNN training by starting with lower-resolution kernels and images, then refining at full resolution, achieving nearly 20% faster training without accuracy loss.
Contribution
The authors propose a novel kernel rescaling technique that reduces CNN training time while maintaining accuracy, applicable to architectures like OverFeat and ResNet.
Findings
Training time reduced by nearly 20%
No loss in test accuracy observed
Applicable to multiple CNN architectures
Abstract
Training deep Convolutional Neural Networks (CNNs) is a time-consuming task that may take weeks to complete. In this article we propose a novel, theoretically founded method for reducing CNN training time without incurring any loss in accuracy. The basic idea is to begin by pre-training a network using lower-resolution kernels and input images, and then refine the results at full resolution by exploiting the spatial scaling property of convolutions. We apply our method to the ImageNet winner OverFeat and to the more recent ResNet architecture and show a reduction in training time of nearly 20% while test set accuracy is preserved in both cases.
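The kernel rescaling step mentioned in the abstract could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name resize_kernel, the choice of bilinear interpolation, and the amplitude factor (old_size / new_size) ** 2 are all hypothetical. The amplitude factor is motivated by the continuous 2-D scaling property of convolution, under which stretching both the image and the kernel by a factor a multiplies the output by a^2.

```python
import numpy as np


def resize_kernel(kernel, new_size):
    """Resample a square 2-D kernel to new_size x new_size (illustrative sketch).

    Assumptions (not taken from the paper): bilinear interpolation on a grid
    that keeps the corner samples fixed, and an amplitude correction of
    (old_size / new_size) ** 2 so the overall response magnitude stays
    roughly comparable after the kernel is stretched.
    """
    kernel = np.asarray(kernel, dtype=float)
    old_size = kernel.shape[0]
    old_pos = np.arange(old_size)
    new_pos = np.linspace(0, old_size - 1, new_size)

    # Interpolate along the horizontal axis: result has shape (old_size, new_size).
    rows = np.stack([np.interp(new_pos, old_pos, r) for r in kernel])
    # Interpolate along the vertical axis: result has shape (new_size, new_size).
    resized = np.stack([np.interp(new_pos, old_pos, c) for c in rows.T], axis=1)

    # Hypothetical amplitude correction derived from the scaling property.
    return resized * (old_size / new_size) ** 2


# Example: expand a 3x3 kernel learned at low resolution to 5x5.
low_res_kernel = np.ones((3, 3))
full_res_kernel = resize_kernel(low_res_kernel, 5)
```

In a two-stage schedule of the kind the abstract describes, a function like this would be applied to the convolutional kernels once, at the switch from low-resolution pre-training to full-resolution refinement. The 3x3 and 5x5 sizes here are illustrative only.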
Taxonomy
Topics: Neural Networks and Applications · Advanced Neural Network Applications · Machine Learning and ELM
Methods: Average Pooling · Dropout · Dense Connections · Softmax · OverFeat · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling
