Searching for A Robust Neural Architecture in Four GPU Hours

Xuanyi Dong; Yi Yang

arXiv:1910.04465 · cs.CV · October 17, 2019 · 1 citation

TL;DR

This paper introduces GDAS, a gradient-based neural architecture search method that finds a high-performing model on CIFAR-10 in four GPU hours, compared with the more than 3000 GPU hours the abstract attributes to reinforcement-learning and evolutionary approaches.

Contribution

The paper represents the search space as a directed acyclic graph (DAG) and trains a learnable, differentiable sampler over it. The sampler is optimized by the validation loss of the sampled architecture, so the whole search runs end to end with gradient descent.

Findings

01. Search completed in four GPU hours on CIFAR-10

02. Discovered model achieves 2.82% test error

03. Model has only 2.5 million parameters

Abstract

Conventional neural architecture search (NAS) approaches are based on reinforcement learning or evolutionary strategy, which take more than 3000 GPU hours to find a good model on CIFAR-10. We propose an efficient NAS approach learning to search by gradient descent. Our approach represents the search space as a directed acyclic graph (DAG). This DAG contains billions of sub-graphs, each of which indicates a kind of neural architecture. To avoid traversing all the possibilities of the sub-graphs, we develop a differentiable sampler over the DAG. This sampler is learnable and optimized by the validation loss after training the sampled architecture. In this way, our approach can be trained in an end-to-end fashion by gradient descent, named Gradient-based search using Differentiable Architecture Sampler (GDAS). In experiments, we can finish one searching procedure in four GPU hours on…
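
The sampling idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the operation names, the three-node DAG, and the function names are all hypothetical, and it assumes a Gumbel-softmax style relaxation as the way to make sampling differentiable, which the abstract itself does not name. Each edge of the DAG holds one learnable logit per candidate operation; adding Gumbel noise and taking the argmax picks one operation per edge, and the softmax of the noisy logits is the smooth surrogate through which a gradient could flow.

```python
import math
import random

# Hypothetical candidate operations (the abstract does not list the real set).
OPS = ["skip", "conv3x3", "conv5x5", "avg_pool"]


def softmax(logits, tau=1.0):
    """Numerically stable softmax of logits / tau."""
    scaled = [l / tau for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_sample(logits, tau, rng):
    """Return (chosen index, relaxed probabilities) for one edge.

    Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
    The argmax of the noisy logits is the discrete choice; the softmax
    of the same noisy logits is the differentiable relaxation.
    """
    noisy = [
        l - math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)))
        for l in logits
    ]
    index = max(range(len(noisy)), key=noisy.__getitem__)
    return index, softmax(noisy, tau)


def sample_architecture(edge_logits, tau=1.0, seed=None):
    """Sample one sub-graph: exactly one operation per DAG edge."""
    rng = random.Random(seed)
    return {
        edge: OPS[gumbel_sample(logits, tau, rng)[0]]
        for edge, logits in edge_logits.items()
    }


# Toy DAG with 3 nodes: node j receives an edge from every earlier node i < j.
edge_logits = {(i, j): [0.0] * len(OPS) for j in range(1, 3) for i in range(j)}

# Each edge independently picks one operation, so the number of sub-graphs
# is len(OPS) ** number_of_edges (4 ** 3 = 64 in this toy setting).
num_subgraphs = len(OPS) ** len(edge_logits)

arch = sample_architecture(edge_logits, tau=1.0, seed=0)
```

In a real search the logits would be tensors in an automatic-differentiation framework, and the validation loss of the sampled architecture would be backpropagated into them; the standard-library version above only shows the sampling step and how quickly the number of sub-graphs grows with edges and operations.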

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Test · Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory