TL;DR
This paper learns convolutional network architectures directly from data: it searches for an architectural building block (a "cell") on a small dataset, CIFAR-10, and transfers that block to a larger one, ImageNet, reaching state-of-the-art accuracy at reduced computational cost.
Contribution
The work proposes a new search space for neural architecture search (the "NASNet search space") designed so that a cell found on a small dataset transfers to a larger one, along with a regularization technique called ScheduledDropPath, improving image recognition performance.
Findings
Achieved a 2.4% error rate on CIFAR-10, reported as state-of-the-art.
Attained 82.7% top-1 accuracy on ImageNet, reported as state-of-the-art.
Reduced computational demand by 28% compared to the previous state-of-the-art model.
Abstract
Developing neural network image classification models often requires significant architecture engineering. In this paper, we study a method to learn the model architectures directly on the dataset of interest. As this approach is expensive when the dataset is large, we propose to search for an architectural building block on a small dataset and then transfer the block to a larger dataset. The key contribution of this work is the design of a new search space (the "NASNet search space") which enables transferability. In our experiments, we search for the best convolutional layer (or "cell") on the CIFAR-10 dataset and then apply this cell to the ImageNet dataset by stacking together more copies of this cell, each with their own parameters to design a convolutional architecture, named "NASNet architecture". We also introduce a new regularization technique called ScheduledDropPath that…
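The transfer step in the abstract (reuse one searched cell, stack more copies for the larger dataset, give each copy its own parameters) can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: `CellSpec`, `Cell`, `build_network`, the operation names, and the cell counts are all hypothetical, and each operation is reduced to a single placeholder weight.

```python
import random
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CellSpec:
    """Architecture of one cell: the part a search would discover (hypothetical format)."""
    ops: tuple  # names of the operations inside the cell


@dataclass
class Cell:
    """One copy of the cell: shares the spec but owns its parameters."""
    spec: CellSpec
    weights: list = field(default_factory=list)


def build_network(spec, num_cells, seed=0):
    """Stack num_cells copies of the same cell, each with independent weights."""
    rng = random.Random(seed)
    cells = []
    for _ in range(num_cells):
        # Placeholder: one scalar weight per operation, drawn fresh for every copy.
        weights = [rng.gauss(0.0, 1.0) for _ in spec.ops]
        cells.append(Cell(spec=spec, weights=weights))
    return cells


# Illustrative operation names and depths, chosen only for this sketch.
spec = CellSpec(ops=("sep_conv_3x3", "avg_pool_3x3", "identity"))
small_net = build_network(spec, num_cells=6)    # shallower stack for the small dataset
large_net = build_network(spec, num_cells=18)   # same cell, more copies for the large dataset
```

The point of the sketch is the separation between the architecture (one shared `CellSpec`) and the parameters (one list per copy): only the former is carried from the small dataset to the large one.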
Taxonomy
Methods: Neural Architecture Search · Sigmoid Activation · Tanh Activation · Entropy Regularization · Proximal Policy Optimization · Exponential Decay · Instance Normalization · Layer Normalization · Dropout · RMSProp
