Efficient Neural Architecture Search via Parameter Sharing
Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean

TL;DR
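ENAS is a fast, inexpensive neural architecture search method built on parameter sharing among candidate (child) models. It achieved state-of-the-art language modeling results at the time of publication (test perplexity 55.8 on Penn Treebank) and competitive image classification results (2.89% test error on CIFAR-10), at about 1000x lower cost than standard Neural Architecture Search.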
Contribution
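The paper presents ENAS, a neural architecture search approach in which every child model is a subgraph of one large computational graph and all child models share parameters. This sharing drastically reduces search time and cost while matching or outperforming existing automatic model design methods.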
Findings
Achieves a test perplexity of 55.8 on Penn Treebank.
Attains a test error of 2.89% on CIFAR-10.
Is about 1000x less expensive, measured in GPU-hours, than standard Neural Architecture Search (NAS).
Abstract
We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set. Meanwhile the model corresponding to the selected subgraph is trained to minimize a canonical cross entropy loss. Thanks to parameter sharing between child models, ENAS is fast: it delivers strong empirical performances using much fewer GPU-hours than all existing automatic model design approaches, and notably, 1000x less expensive than standard Neural Architecture Search. On the Penn Treebank dataset, ENAS discovers a novel architecture that achieves a test perplexity of 55.8, establishing a new…
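The interplay described in the abstract (a controller trained with policy gradient, child models trained on a cross-entropy loss, and parameters shared across all children) can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the authors' implementation: the controller here is a plain table of softmax logits, the search space is a hypothetical two-layer scalar chain where each layer picks one activation, the reward is validation accuracy with a moving-average baseline, and the two training phases simply alternate. All names (`OPS`, `N_LAYERS`, `search`, and so on) are made up for illustration.

```python
import math
import random

# Hypothetical toy search space: each of N_LAYERS nodes picks one activation.
OPS = ["tanh", "relu", "identity", "sigmoid"]
N_LAYERS = 2

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def apply_op(name, z):
    """Return (value, derivative) of activation `name` at pre-activation z."""
    if name == "tanh":
        t = math.tanh(z)
        return t, 1.0 - t * t
    if name == "relu":
        return (z, 1.0) if z > 0 else (0.0, 0.0)
    if name == "sigmoid":
        s = sigmoid(z)
        return s, s * (1.0 - s)
    return z, 1.0  # identity

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_arch(theta, rng):
    """Sample one op index per layer from the controller's softmax policy."""
    return [rng.choices(range(len(OPS)), weights=softmax(row))[0] for row in theta]

def forward(shared_w, arch, x):
    """Run the child model (subgraph) selected by `arch`; return logit and cache."""
    h, cache = x, []
    for layer, op in enumerate(arch):
        z = shared_w[layer][op] * h
        out, dout = apply_op(OPS[op], z)
        cache.append((h, dout))
        h = out
    return h, cache

def train_child_step(shared_w, arch, x, y, lr=0.1):
    """One SGD step on cross-entropy, updating only the shared weights arch uses."""
    logit, cache = forward(shared_w, arch, x)
    grad = sigmoid(logit) - y  # d(cross-entropy)/d(logit)
    for layer in reversed(range(len(arch))):
        h_in, dout = cache[layer]
        w = shared_w[layer][arch[layer]]
        dz = grad * dout
        shared_w[layer][arch[layer]] = w - lr * dz * h_in
        grad = dz * w  # propagate through the pre-update weight

def accuracy(shared_w, arch, data):
    hits = sum((forward(shared_w, arch, x)[0] > 0) == (y == 1) for x, y in data)
    return hits / len(data)

def search(steps=200, seed=0):
    rng = random.Random(seed)

    def make_data(n):
        # Toy task: label is 1 when the scalar input is positive.
        return [(x, int(x > 0)) for x in (rng.uniform(-1, 1) for _ in range(n))]

    train, valid = make_data(64), make_data(64)
    shared_w = [[rng.uniform(-0.5, 0.5) for _ in OPS] for _ in range(N_LAYERS)]
    theta = [[0.0] * len(OPS) for _ in range(N_LAYERS)]  # controller logits
    baseline = 0.0
    for _ in range(steps):
        # Phase 1: controller fixed; train the shared weights of a sampled child.
        arch = sample_arch(theta, rng)
        for x, y in train:
            train_child_step(shared_w, arch, x, y)
        # Phase 2: shared weights fixed; REINFORCE update of the controller.
        arch = sample_arch(theta, rng)
        reward = accuracy(shared_w, arch, valid)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for layer, chosen in enumerate(arch):
            probs = softmax(theta[layer])
            for k in range(len(OPS)):
                indicator = 1.0 if k == chosen else 0.0
                theta[layer][k] += 0.5 * advantage * (indicator - probs[k])
    return theta, shared_w, valid
```

Calling `theta, shared_w, valid = search()` and then `softmax(theta[0])` shows how much probability the toy controller assigns to each activation in the first layer. The point of the sketch is that no child is trained from scratch: every sampled architecture reads and writes the same `shared_w` table, so evaluating a new candidate costs one pass over the validation set.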
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
