Fast Training of Convolutional Neural Networks via Kernel Rescaling

Pedro Porto Buarque de Gusmão; Gianluca Francini; Skjalg Lepsøy; Enrico Magli

arXiv:1610.03623 · cs.CV · October 13, 2016 · 2 cites

TL;DR

This paper introduces a theoretically grounded method to accelerate CNN training by starting with lower-resolution kernels and images, then refining at full resolution, achieving nearly 20% faster training without accuracy loss.

Contribution

The authors propose a novel kernel rescaling technique that reduces CNN training time while maintaining accuracy, applicable to architectures like OverFeat and ResNet.

Findings

01 Training time reduced by nearly 20%

02 No loss in test accuracy observed

03 Applicable to multiple CNN architectures

Abstract

Training deep Convolutional Neural Networks (CNNs) is a time-consuming task that may take weeks to complete. In this article we propose a novel, theoretically founded method for reducing CNN training time without incurring any loss in accuracy. The basic idea is to first pre-train a network using lower-resolution kernels and input images, and then refine the results at full resolution by exploiting the spatial scaling property of convolutions. We apply our method to the ImageNet winner OverFeat and to the more recent ResNet architecture and show a reduction in training time of nearly 20% while test set accuracy is preserved in both cases.
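
The spatial scaling idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how kernels are interpolated or how training is scheduled, so nearest-neighbour upscaling, the amplitude normalisation by the squared scale factor, and the helper names `upscale_kernel` and `correlate_valid` are all assumptions made for illustration. Under these assumptions, a kernel learned at low resolution can be stretched to full resolution so that it produces the same responses on a correspondingly stretched image.

```python
import numpy as np


def upscale_kernel(kernel, factor):
    """Stretch a kernel spatially by `factor` (nearest-neighbour, an assumption).

    Dividing by factor**2 keeps the sum of the kernel unchanged, which mirrors
    the amplitude term in the scaling property of convolution.
    """
    block = np.ones((factor, factor))
    return np.kron(kernel, block) / factor**2


def correlate_valid(image, kernel, stride=1):
    """Plain 'valid' cross-correlation, as used in CNN layers (hypothetical helper)."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out


rng = np.random.default_rng(0)
factor = 2

# Stand-ins for a low-resolution input and a kernel learned at low resolution.
low_image = rng.standard_normal((8, 8))
low_kernel = rng.standard_normal((3, 3))

# Full-resolution counterparts: the image is a pixel-replicated version of the
# low-resolution one, and the kernel is stretched by the same factor.
high_image = np.kron(low_image, np.ones((factor, factor)))
high_kernel = upscale_kernel(low_kernel, factor)

low_response = correlate_valid(low_image, low_kernel)
high_response = correlate_valid(high_image, high_kernel, stride=factor)

# For this idealised, pixel-replicated image the two responses coincide.
print(np.allclose(low_response, high_response))
```

Real full-resolution images are not exact pixel replications of their downsampled versions, so in practice the rescaled kernels would only serve as an initialisation that the full-resolution refinement stage then adjusts.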

Taxonomy

Topics: Neural Networks and Applications · Advanced Neural Network Applications · Machine Learning and ELM

Methods: Average Pooling · Dropout · Dense Connections · Softmax · OverFeat · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling