SNAS: Stochastic Neural Architecture Search
Sirui Xie, Hehui Zheng, Chunxiao Liu, Liang Lin

TL;DR
SNAS is a differentiable, end-to-end neural architecture search method that optimizes architecture parameters with gradient-based optimization. It reaches state-of-the-art accuracy on CIFAR-10 and finds architectures that transfer to ImageNet.
Contribution
The paper presents a novel stochastic formulation of NAS in which architecture distribution parameters and operation parameters are trained in the same round of back-propagation, which makes the search cheaper while keeping the NAS pipeline complete and differentiable.
Findings
SNAS finds high-performing cell architectures in fewer search epochs than non-differentiable evolution-based and reinforcement-learning-based NAS.
SNAS achieves state-of-the-art accuracy on CIFAR-10.
SNAS's architectures transfer well to ImageNet.
Abstract
We propose Stochastic Neural Architecture Search (SNAS), an economical end-to-end solution to Neural Architecture Search (NAS) that trains neural operation parameters and architecture distribution parameters in the same round of back-propagation, while maintaining the completeness and differentiability of the NAS pipeline. In this work, NAS is reformulated as an optimization problem on the parameters of a joint distribution for the search space in a cell. To leverage the gradient information in a generic differentiable loss for architecture search, a novel search gradient is proposed. We prove that this search gradient optimizes the same objective as reinforcement-learning-based NAS, but assigns credit to structural decisions more efficiently. This credit assignment is further augmented with a locally decomposable reward to enforce a resource-efficient constraint. In experiments on CIFAR-10, SNAS…
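The abstract does not say how an architecture sample is kept differentiable. One common choice, and to our understanding the one taken in the full paper, is to relax the one-hot choice of operation on each edge with the concrete (Gumbel-softmax) distribution. The sketch below illustrates only that sampling step in plain Python. The candidate operations, the function names, and the temperature values are hypothetical and chosen for illustration; an actual search would compute the same quantities inside an autodiff framework so that the loss gradient reaches the logits.

```python
import math
import random

def sample_relaxed_one_hot(logits, temperature, rng):
    """Draw a Gumbel-softmax (concrete) sample: a soft one-hot vector over ops."""
    # Gumbel(0, 1) noise by inverse transform: g = -log(-log(u)), u ~ Uniform(0, 1).
    # u is kept strictly inside (0, 1) so both logarithms are defined.
    gumbels = [-math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)))
               for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations on one edge of a cell (scalars for brevity).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_edge(x, logits, temperature, rng):
    """Edge output as the sample-weighted sum of all candidate operations."""
    weights = sample_relaxed_one_hot(logits, temperature, rng)
    output = sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))
    return output, weights

if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.0, 0.0, 0.0]  # architecture distribution parameters (assumed init)
    for temperature in (5.0, 1.0, 0.1):
        y, w = mixed_edge(3.0, logits, temperature, rng)
        print(temperature, [round(p, 3) for p in w], round(y, 3))
```

As the temperature decreases, samples tend to concentrate on a single operation, so the weighted sum approaches the output of one discrete architecture choice while remaining a smooth function of the logits.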
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
