ADWPNAS: Architecture-Driven Weight Prediction for Neural Architecture Search

Xu Zhang; Chenjun Zhou; Bo Gu

arXiv:2003.01335·cs.NE·March 4, 2020·1 cites

Open Access

TL;DR

This paper introduces ADWPNAS, a neural architecture search method that uses a HyperNetwork to predict model weights directly from an architecture encoding, so candidate architectures can be evaluated without finetuning. The reported search takes 4.0 GPU hours on CIFAR-10 and yields a model with 2.41% test error and 1.52M parameters.

Contribution

The paper presents a novel architecture-driven weight prediction approach for NAS that significantly reduces search time and improves model performance.

Findings

01. Search procedure completes in 4.0 GPU hours on CIFAR-10.

02. Discovered model achieves 2.41% test error with 1.52M parameters.

03. Method outperforms existing models in efficiency and accuracy.

Abstract

How to discover and evaluate the true strength of models quickly and accurately is one of the key challenges in Neural Architecture Search (NAS). To cope with this problem, we propose an Architecture-Driven Weight Prediction (ADWP) approach for NAS. In our approach, we first design an architecture-intensive search space and then train a HyperNetwork by feeding it stochastically encoded architecture parameters. With the trained HyperNetwork, the weights of convolution kernels can be well predicted for neural architectures in the search space. Consequently, the target architectures can be evaluated efficiently without any finetuning, thus enabling us to search for the optimal architecture in the space of general networks (macro-search). Through real experiments, we evaluate the performance of the models discovered by the proposed ADWPNAS, and the results show that one search…
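The weight-prediction idea in the abstract can be illustrated with a toy sketch: a hypernetwork maps an architecture encoding to the weights of a convolution kernel, so a sampled architecture gets weights without being trained itself. Everything below is an assumption made for illustration, not the paper's design: the hypernetwork is a single linear map, the encoding is a random vector in [0, 1), and the names `make_hypernetwork`, `sample_encoding`, and `predict_kernel` are hypothetical.

```python
import random


def make_hypernetwork(encoding_dim, kernel_size, in_ch, out_ch, seed=0):
    """Create a toy linear hypernetwork (assumption: one linear layer, no bias).

    Returns one row of parameters per predicted kernel weight.
    """
    rng = random.Random(seed)
    n_weights = out_ch * in_ch * kernel_size * kernel_size
    return [[rng.gauss(0.0, 0.1) for _ in range(encoding_dim)]
            for _ in range(n_weights)]


def sample_encoding(encoding_dim, rng):
    """Sample a stochastic architecture encoding (hypothetical format)."""
    return [rng.random() for _ in range(encoding_dim)]


def predict_kernel(hyper_params, encoding, kernel_size, in_ch, out_ch):
    """Predict a conv kernel of shape [out_ch][in_ch][k][k] from an encoding."""
    # Each predicted weight is the dot product of one parameter row
    # with the architecture encoding.
    flat = [sum(w * e for w, e in zip(row, encoding)) for row in hyper_params]
    k = kernel_size
    # Reshape the flat weight list into nested lists.
    return [[[[flat[((o * in_ch + i) * k + r) * k + c] for c in range(k)]
              for r in range(k)]
             for i in range(in_ch)]
            for o in range(out_ch)]


# Usage: predict 3x3 kernels (2 input channels, 5 output channels)
# for one randomly sampled 4-dimensional encoding.
hyper = make_hypernetwork(encoding_dim=4, kernel_size=3, in_ch=2, out_ch=5)
encoding = sample_encoding(4, random.Random(1))
kernel = predict_kernel(hyper, encoding, kernel_size=3, in_ch=2, out_ch=5)
```

In a real setting the hypernetwork parameters would be trained so that the predicted kernels give low loss across many sampled encodings; here they are left at their random initial values, so the sketch shows only the encoding-to-weights mapping.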

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: HyperNetwork · Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory · Convolution