Efficient Architecture Search by Network Transformation

Han Cai; Tianyao Chen; Weinan Zhang; Yong Yu; Jun Wang

arXiv:1707.04873 · cs.LG · November 22, 2017 · 323 cites

TL;DR

This paper introduces an efficient neural architecture search method that reuses network weights during exploration, significantly reducing computational costs while designing competitive CNNs on image benchmarks.

Contribution

It proposes a reinforcement learning framework that employs function-preserving transformations to grow networks, enabling weight reuse and efficient architecture search.

Findings

01 Achieved 4.23% test error on CIFAR-10 with a network that uses no skip-connections.

02 The designed networks outperform most existing architectures built with the same design scheme.

03 Explored the DenseNet architecture space and found networks that are more accurate with fewer parameters.

Abstract

Techniques for automatically designing deep neural network architectures, such as reinforcement learning based approaches, have recently shown promising results. However, their success relies on vast computational resources (e.g. hundreds of GPUs), which makes them difficult to use widely. A noticeable limitation is that they still design and train each network from scratch during the exploration of the architecture space, which is highly inefficient. In this paper, we propose a new framework toward efficient architecture search that explores the architecture space based on the current network and reuses its weights. We employ a reinforcement learning agent as the meta-controller, whose action is to grow the network depth or layer width with function-preserving transformations. As such, previously validated networks can be reused for further exploration, thus saving a large amount of…
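The two growth actions named in the abstract, widening a layer and deepening the network while preserving the function, can be illustrated on a small fully connected network. The following is a minimal NumPy sketch under stated assumptions, not the paper's implementation: it uses a plain ReLU MLP rather than a CNN, it omits the reinforcement learning meta-controller entirely, and the helper names `forward`, `widen`, and `deepen` are hypothetical. Widening copies existing hidden units and divides their outgoing weights by the number of copies; deepening inserts an identity-initialised layer.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def forward(x, layers):
    """Run an MLP stored as a list of (W, b) pairs; ReLU follows every layer except the last."""
    h = x
    for i, (W, b) in enumerate(layers):
        h = h @ W + b
        if i < len(layers) - 1:
            h = relu(h)
    return h


def widen(layers, idx, new_width, rng):
    """Widen hidden layer `idx` to `new_width` units without changing the network's output.

    New units are copies of randomly chosen existing units. The outgoing
    weights of every unit are divided by how many copies of it now exist,
    so the next layer receives the same pre-activations as before.
    """
    assert idx < len(layers) - 1, "only hidden layers can be widened"
    W, b = layers[idx]
    W_next, b_next = layers[idx + 1]
    old_width = W.shape[1]
    assert new_width >= old_width
    # mapping[j] is the original unit that new unit j replicates.
    extra = rng.integers(0, old_width, size=new_width - old_width)
    mapping = np.concatenate([np.arange(old_width), extra])
    counts = np.bincount(mapping, minlength=old_width)
    out = list(layers)
    out[idx] = (W[:, mapping], b[mapping])
    out[idx + 1] = (W_next[mapping, :] / counts[mapping][:, None], b_next)
    return out


def deepen(layers, idx):
    """Insert an identity-initialised layer after hidden layer `idx`.

    The input to the new layer is already non-negative (it is a ReLU output),
    so relu(h @ I + 0) == h and the network's output is unchanged.
    """
    assert idx < len(layers) - 1, "insert only after a hidden layer"
    width = layers[idx][0].shape[1]
    identity_layer = (np.eye(width), np.zeros(width))
    return layers[: idx + 1] + [identity_layer] + layers[idx + 1:]


# Usage: grow a 4-5-3 network to 4-8-8-3 and check the outputs still match.
rng = np.random.default_rng(0)
net = [
    (rng.normal(size=(4, 5)), rng.normal(size=5)),
    (rng.normal(size=(5, 3)), rng.normal(size=3)),
]
x = rng.normal(size=(7, 4))
grown = deepen(widen(net, 0, 8, rng), 0)
print(np.allclose(forward(x, net), forward(x, grown)))
```

Because the grown network computes the same function as its parent, training can resume from the parent's weights instead of starting from scratch, which is the source of the savings the abstract describes.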

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning

Methods: Batch Normalization · Convolution · Average Pooling · Concatenated Skip Connection · Global Average Pooling · Dense Block · Kaiming Initialization · 1x1 Convolution · Dropout