Path-Level Network Transformation for Efficient Architecture Search

Han Cai; Jiacheng Yang; Weinan Zhang; Song Han; Yong Yu

arXiv:1806.02639 · cs.LG · June 8, 2018 · 118 cites

TL;DR

This paper presents a novel path-level network transformation method that enhances neural architecture search by enabling topology modifications while reusing trained weights, leading to more efficient and effective model design.

Contribution

The authors introduce a path-level transformation operation and a bidirectional tree-structured reinforcement learning controller for flexible architecture search.

Findings

1. Achieved 97.70% accuracy on CIFAR-10 with 14.3M parameters.

2. Obtained 74.6% top-1 accuracy on ImageNet in the mobile setting.

3. Demonstrated improved parameter efficiency and transferability.

Abstract

We introduce a new function-preserving transformation for efficient neural architecture search. This network transformation allows reusing previously trained networks and existing successful architectures, which improves sample efficiency. We aim to address the limitation of current network transformation operations, which can only perform layer-level architecture modifications, such as adding (pruning) filters or inserting (removing) a layer, and therefore fail to change the topology of connection paths. Our proposed path-level transformation operations enable the meta-controller to modify the path topology of the given network while keeping the merits of reusing weights, and thus allow efficiently designing effective structures with complex path topologies like Inception models. We further propose a bidirectional tree-structured reinforcement learning meta-controller to explore a simple yet…
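
To make "function-preserving" concrete, here is a minimal sketch of one way a single layer can become a multi-branch structure without changing its output: copy the layer into identical branches and average their results. This is an illustration under stated assumptions, not the authors' implementation. The layer is a plain linear map rather than a convolution, and the names `dense` and `replicate_and_average` are hypothetical.

```python
import random


def dense(weights, x):
    """Apply a linear layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def replicate_and_average(weights, n_branches):
    """Hypothetical helper: copy one layer into n identical branches.

    Every branch receives the same input and the branch outputs are
    averaged, so the motif initially computes the same function as
    the single layer it replaces.
    """
    branches = [[row[:] for row in weights] for _ in range(n_branches)]

    def forward(x):
        outs = [dense(b, x) for b in branches]
        return [sum(vals) / n_branches for vals in zip(*outs)]

    return branches, forward


random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
x = [random.uniform(-1, 1) for _ in range(4)]

original = dense(W, x)
branches, multi_branch = replicate_and_average(W, n_branches=2)
transformed = multi_branch(x)

# Largest difference between the single layer and the two-branch motif.
max_gap = max(abs(a - b) for a, b in zip(original, transformed))
```

Because each branch holds its own copy of the weights, the branches can later be trained or modified independently, which is what lets the path topology change while the trained weights are reused as a starting point.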

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning