SNAS: Stochastic Neural Architecture Search

Sirui Xie; Hehui Zheng; Chunxiao Liu; Liang Lin

arXiv:1812.09926 · cs.LG · April 2, 2020 · 285 cites

TL;DR

SNAS is a differentiable, end-to-end neural architecture search method that optimizes architecture distribution parameters with gradients, reaching state-of-the-art accuracy on CIFAR-10 with architectures that also transfer to ImageNet.

Contribution

The paper reformulates NAS as an optimization problem over the parameters of a joint distribution on a cell's search space, so that architecture distribution parameters and neural operation parameters are trained in the same round of back-propagation.
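
Training architecture and operation parameters together by back-propagation requires that sampling an architecture be differentiable. A minimal sketch of one way to do this is below, assuming a Gumbel-softmax (concrete) relaxation of the choice of operation on a single edge; this page does not name the relaxation, so treat that choice as an assumption. The candidate operations, `sample_relaxed_one_hot`, and `mixed_edge_output` are hypothetical names chosen for illustration, not the authors' code.

```python
import math
import random

def sample_relaxed_one_hot(logits, temperature, rng):
    """Draw a soft one-hot vector over operations (Gumbel-softmax sample)."""
    noisy = []
    for logit in logits:
        # Clamp U away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations on one edge of a cell (assumption).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_edge_output(x, logits, temperature, rng):
    """Weight every candidate operation by the sampled relaxed one-hot vector."""
    weights = sample_relaxed_one_hot(logits, temperature, rng)
    output = sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))
    return output, weights

rng = random.Random(0)           # fixed seed so the draw is repeatable
logits = [0.0, 1.0, -1.0]        # architecture distribution parameters (illustrative)
output, weights = mixed_edge_output(3.0, logits, temperature=0.5, rng=rng)
```

Because the edge output is a smooth function of `logits`, an automatic-differentiation framework could propagate a loss gradient to the architecture parameters and the operation parameters in one backward pass. Lower temperatures push the sampled weights closer to a hard one-hot choice.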

Findings

01

SNAS finds high-performing architectures faster than earlier non-differentiable NAS methods, such as reinforcement-learning-based search.

02

SNAS achieves state-of-the-art accuracy on CIFAR-10 with fewer epochs.

03

SNAS's architectures transfer well to ImageNet.

Abstract

We propose Stochastic Neural Architecture Search (SNAS), an economical end-to-end solution to Neural Architecture Search (NAS) that trains neural operation parameters and architecture distribution parameters in the same round of back-propagation, while maintaining the completeness and differentiability of the NAS pipeline. In this work, NAS is reformulated as an optimization problem on parameters of a joint distribution for the search space in a cell. To leverage the gradient information in a generic differentiable loss for architecture search, a novel search gradient is proposed. We prove that this search gradient optimizes the same objective as reinforcement-learning-based NAS, but assigns credits to structural decisions more efficiently. This credit assignment is further augmented with a locally decomposable reward to enforce a resource-efficient constraint. In experiments on CIFAR-10, SNAS…

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory