Refining Architectures of Deep Convolutional Neural Networks

Sukrit Shankar; Duncan Robertson; Yani Ioannou; Antonio Criminisi; Roberto Cipolla

arXiv:1604.06832 · cs.CV · April 26, 2016

TL;DR

This paper presents a novel method to refine CNN architectures by optimally stretching and splitting layers, aiming to improve accuracy and reduce size for specific datasets, demonstrated on natural scene attribute datasets.

Contribution

Introduces a new architecture refinement strategy using stretching and splitting operations, enhancing CNN performance tailored to particular datasets.

Findings

01 Refinement improves accuracy on the SUN Attributes and CAMIT-NSAD datasets.

02 The method reduces model size while maintaining or improving accuracy.

03 Effective across different CNN architectures, such as GoogLeNet and VGG-11.

Abstract

Deep Convolutional Neural Networks (CNNs) have recently evinced immense success for various image recognition tasks. However, a question of paramount importance is somewhat unanswered in deep learning research: is the selected CNN optimal for the dataset in terms of accuracy and model size? In this paper, we intend to answer this question and introduce a novel strategy that alters the architecture of a given CNN for a specified dataset, to potentially enhance the original accuracy while possibly reducing the model size. We use two operations for architecture refinement, viz. stretching and symmetrical splitting. Our procedure starts with a pre-trained CNN for a given dataset, and optimally decides the stretch and split factors across the network to refine the architecture. We empirically demonstrate the necessity of the two operations. We evaluate our approach on two natural scenes…
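
As a rough illustration of how the two operations named in the abstract could affect model size, the sketch below counts the weights of a single convolutional layer. It rests on assumptions the abstract does not spell out: that stretching multiplies a layer's number of filters by a stretch factor, and that symmetrical splitting partitions the input and output channels into equal groups, as in a grouped convolution. The function names, layer sizes, and factors are hypothetical, and this is not the paper's procedure for choosing the factors.

```python
def conv_weight_count(c_in, c_out, kernel, groups=1):
    """Weights in a kernel x kernel convolution (biases ignored).

    With groups > 1, each group connects c_in/groups input channels
    to c_out/groups output channels (assumed meaning of "splitting").
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by the number of groups")
    return (c_in // groups) * (c_out // groups) * kernel * kernel * groups


def stretch(c_out, factor):
    """Assumed meaning of "stretching": scale the number of filters."""
    return int(round(c_out * factor))


# Hypothetical 3x3 layer with 256 input and 256 output channels.
base = conv_weight_count(256, 256, 3)
stretched = conv_weight_count(256, stretch(256, 1.5), 3)
split = conv_weight_count(256, 256, 3, groups=2)

print("base:     ", base)
print("stretched:", stretched)
print("split:    ", split)
```

Under these assumptions, stretching grows a layer's weight count in proportion to the stretch factor, while splitting into two groups halves it, which is consistent with the abstract's aim of trading off accuracy against model size.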

Taxonomy

Methods: 1x1 Convolution · Convolution · Average Pooling · Local Response Normalization · Auxiliary Classifier · Inception Module · Dropout · Dense Connections · Max Pooling