Efficient Architecture Search by Network Transformation
Han Cai, Tianyao Chen, Weinan Zhang, Yong Yu, Jun Wang

TL;DR
This paper introduces an efficient neural architecture search method that reuses network weights during exploration, significantly reducing computational costs while designing competitive CNNs on image benchmarks.
Contribution
It proposes a reinforcement learning framework that employs function-preserving transformations to grow networks, enabling weight reuse and efficient architecture search.
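To make "function-preserving transformation" concrete, here is a minimal sketch of widening one hidden layer of a small fully connected ReLU network, in the style of Net2Net-like widening. It is an illustration only, not the authors' implementation: the helper names (`forward`, `widen`), the two-layer network, and the pure-Python list representation are all assumptions made for this example. New hidden units copy the incoming weights of randomly chosen existing units, and the outgoing weights of every replicated unit are divided by its number of copies, so the output is unchanged.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(W1, W2, x):
    # Hypothetical two-layer network: x -> ReLU(W1 x) -> W2 h.
    return matvec(W2, relu(matvec(W1, x)))


def widen(W1, W2, new_width, rng):
    """Widen the hidden layer to new_width units without changing the output."""
    old_width = len(W1)
    if new_width < old_width:
        raise ValueError("new_width must be at least the current width")
    # mapping[j] is the original hidden unit that new unit j copies.
    extra = [rng.randrange(old_width) for _ in range(new_width - old_width)]
    mapping = list(range(old_width)) + extra
    counts = [mapping.count(i) for i in range(old_width)]
    # Incoming weights are copied as-is.
    W1_wide = [list(W1[m]) for m in mapping]
    # Outgoing weights are split evenly among the copies of each unit.
    W2_wide = [[row[m] / counts[m] for m in mapping] for row in W2]
    return W1_wide, W2_wide


rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 3 -> 4
W2 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]  # 4 -> 2
x = [0.5, -1.0, 2.0]

W1_wide, W2_wide = widen(W1, W2, 6, rng)
before = forward(W1, W2, x)
after = forward(W1_wide, W2_wide, x)
```

Because `before` and `after` agree up to floating-point rounding, the widened network can continue training from the weights of the smaller one instead of starting from scratch. This sketch omits any noise that might be added to the copied units to break symmetry.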
Findings
The method achieves a 4.23% test error rate on CIFAR-10 with a model that has no skip-connections.
The designed networks outperform existing networks that use the same design scheme.
Applying the method to the DenseNet architecture space yields more accurate networks with fewer parameters.
Abstract
Techniques for automatically designing deep neural network architectures, such as reinforcement learning based approaches, have recently shown promising results. However, their success is based on vast computational resources (e.g. hundreds of GPUs), making them difficult to use widely. A noticeable limitation is that they still design and train each network from scratch during the exploration of the architecture space, which is highly inefficient. In this paper, we propose a new framework toward efficient architecture search by exploring the architecture space based on the current network and reusing its weights. We employ a reinforcement learning agent as the meta-controller, whose action is to grow the network depth or layer width with function-preserving transformations. As such, the previously validated networks can be reused for further exploration, thus saving a large amount of…
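The depth-growing action mentioned in the abstract can be sketched in the same spirit. This is a hedged illustration, not the paper's code: it assumes a plain fully connected network whose hidden layers use ReLU, and the names `forward`, `deepen`, and `identity` are hypothetical. A new hidden layer initialized to the identity matrix leaves the function unchanged when inserted after a ReLU layer, because ReLU applied to an already non-negative vector returns it as-is.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def forward(layers, x):
    # Hidden layers apply ReLU; the final layer is linear.
    h = x
    for W in layers[:-1]:
        h = relu(matvec(W, h))
    return matvec(layers[-1], h)


def deepen(layers, position):
    """Insert an identity-initialized hidden layer after hidden layer `position`."""
    if not 0 <= position < len(layers) - 1:
        raise ValueError("position must index a hidden (ReLU) layer")
    width = len(layers[position])  # number of output units of that layer
    return layers[:position + 1] + [identity(width)] + layers[position + 1:]


rng = random.Random(1)
layers = [
    [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)],  # 3 -> 4, ReLU
    [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(4)],  # 4 -> 4, ReLU
    [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)],  # 4 -> 2, linear
]
x = [0.5, -1.0, 2.0]

deeper = deepen(layers, 1)
out_before = forward(layers, x)
out_after = forward(deeper, x)
```

The deeper network computes the same outputs as the original, so a search procedure can evaluate it after only a short period of further training. The identity trick relies on the activation satisfying ReLU(ReLU(z)) = ReLU(z); other activations or normalization layers would need their own initialization.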
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Batch Normalization · Convolution · Average Pooling · Concatenated Skip Connection · Global Average Pooling · Dense Block · Kaiming Initialization · 1x1 Convolution · Dropout
