Simple And Efficient Architecture Search for Convolutional Neural Networks
Thomas Elsken, Jan-Hendrik Metzen, Frank Hutter

TL;DR
This paper introduces a simple method for automatically designing CNN architectures using hill climbing and network morphisms. It achieves competitive results at a computational cost comparable to training a single network.
Contribution
An architecture search method that combines hill climbing with network morphisms and short training runs under cosine annealing, keeping the total cost of the search in the same order of magnitude as training a single network.
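A network morphism changes an architecture while leaving the function it computes unchanged, so a child network can start from its parent's performance rather than from scratch. The sketch below illustrates one such operation on a toy fully connected ReLU network in plain Python: inserting an identity layer after an existing ReLU layer. The helper names (`forward`, `deepen`) and the list-of-lists weight layout are assumptions made for illustration; the paper's own operators act on convolutional networks and are not reproduced here.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]      # one row per output unit
Layer = Tuple[Matrix, Vector]   # (weights, biases)

def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]

def linear(weights: Matrix, biases: Vector, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, biases)]

def forward(layers: List[Layer], v: Vector) -> Vector:
    # Every layer is a linear map followed by ReLU.
    for weights, biases in layers:
        v = relu(linear(weights, biases, v))
    return v

def deepen(layers: List[Layer], index: int) -> List[Layer]:
    """Insert an identity layer after layers[index] (hypothetical helper).

    The inserted layer receives non-negative ReLU outputs, and
    relu(I @ x + 0) == x for x >= 0, so the network function is preserved.
    """
    width = len(layers[index][0])  # number of output units of that layer
    identity = [[1.0 if i == j else 0.0 for j in range(width)]
                for i in range(width)]
    new_layer: Layer = (identity, [0.0] * width)
    return layers[:index + 1] + [new_layer] + layers[index + 1:]

# Toy parent network: 2 inputs -> 3 hidden units -> 1 output.
parent = [
    ([[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]], [0.1, -0.2, 0.0]),
    ([[1.0, -0.5, 0.3]], [0.05]),
]
child = deepen(parent, 0)
x = [0.8, -0.4]
print(forward(parent, x), forward(child, x))
```

The child has one more layer than the parent but produces the same output for every input, which is what allows a search procedure to continue training from the parent's weights.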
Findings
Achieves below 6% error on CIFAR-10 in 12 hours on a single GPU.
Training for one day reduces the error further, to almost 5%.
The method requires resources of the same order of magnitude as training a single network.
Abstract
Neural networks have recently had a lot of success on many tasks. However, neural network architectures that perform well are still typically designed manually by experts in a cumbersome trial-and-error process. We propose a new method to automatically search for well-performing CNN architectures based on a simple hill climbing procedure whose operators apply network morphisms, followed by short optimization runs with cosine annealing. Surprisingly, this simple method yields competitive results, despite requiring resources only of the same order of magnitude as training a single network. For example, on CIFAR-10, our method designs and trains networks with an error rate below 6% in only 12 hours on a single GPU; training for one day reduces this error further, to almost 5%.
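The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `hill_climb` signature, the number of children per step, and the choice to keep the parent among the candidates are all hypothetical, and the "model" in the toy usage is just an integer depth with a made-up score. The cosine annealing formula is the standard one, lr(t) = lr_min + 0.5 (lr_max - lr_min)(1 + cos(pi t / T)).

```python
import math
import random
from typing import Callable, Optional, Sequence, TypeVar

Model = TypeVar("Model")

def cosine_annealing_lr(step: int, total_steps: int,
                        lr_max: float = 0.05, lr_min: float = 0.0) -> float:
    """Learning rate that decays from lr_max to lr_min along a half cosine."""
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1.0 + math.cos(math.pi * step / total_steps))

def hill_climb(initial: Model,
               morphisms: Sequence[Callable[[Model], Model]],
               train: Callable[[Model], Model],
               evaluate: Callable[[Model], float],
               n_steps: int = 5,
               n_children: int = 4,
               rng: Optional[random.Random] = None) -> Model:
    """Greedy search: mutate the current best, train briefly, keep the winner.

    `train` stands in for a short training run (for example SGD with the
    cosine schedule above); `evaluate` returns a validation score where
    higher is better. Both are supplied by the caller.
    """
    rng = rng or random.Random(0)
    best = initial
    for _ in range(n_steps):
        # Assumption: the parent stays in the pool, so the score never drops.
        candidates = [best]
        for _ in range(n_children):
            child = rng.choice(morphisms)(best)
            candidates.append(train(child))
        best = max(candidates, key=evaluate)
    return best

# Toy usage: a "model" is only its depth, and the score peaks at depth 5.
def score(depth: int) -> float:
    return -float((depth - 5) ** 2)

found = hill_climb(initial=1,
                   morphisms=[lambda d: d + 1, lambda d: d + 2],
                   train=lambda d: d,  # no real training in this toy
                   evaluate=score)
print(found, score(found))
print([round(cosine_annealing_lr(t, 10), 4) for t in range(0, 11, 5)])
```

Because each child is produced by a function-preserving operation, a real `train` call can be short: it fine-tunes an already working network instead of training one from random initialization.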
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Human Pose and Action Recognition
Methods: Softmax · Neural Architecture Search
