Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search
Arber Zela, Aaron Klein, Stefan Falkner, Frank Hutter

TL;DR
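This paper uses a combination of Bayesian optimization and Hyperband to search neural architectures and hyperparameters jointly. It targets two weaknesses of standard NAS pipelines: tuning hyperparameters in a separate post-processing step can be suboptimal, and running the main search with very few epochs is inefficient.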
Contribution
It proposes searching over architecture and hyperparameters simultaneously, using a recent combination of Bayesian optimization and Hyperband, instead of the usual two-stage pipeline of architecture search followed by hyperparameter tuning.
Findings
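Architectural choices and hyperparameter settings interact, so tuning hyperparameters in a separate post-processing step can be suboptimal.
Relative rankings after very few training epochs correlate little with rankings after many epochs, which makes the common practice of short training during NAS followed by long post-processing inefficient.
Combining Bayesian optimization with Hyperband makes the joint search efficient.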
The method finds better configurations than traditional NAS.
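As a rough illustration of joint search under a multi-fidelity budget, the sketch below runs one successive-halving bracket (the subroutine Hyperband is built on) over configurations that mix architecture and training hyperparameters. Everything named here is an assumption made for illustration: the search space (`num_blocks`, `width`, `learning_rate`), the synthetic `toy_loss` that stands in for real training, and the plain random sampler, which replaces the model-based Bayesian optimization sampler the paper relies on.

```python
import math
import random


def sample_config(rng):
    """Draw one joint configuration: architecture and hyperparameters together."""
    return {
        "num_blocks": rng.choice([1, 2, 3]),         # architecture choice (hypothetical)
        "width": rng.choice([16, 32, 64]),           # architecture choice (hypothetical)
        "learning_rate": 10 ** rng.uniform(-4, -1),  # hyperparameter (hypothetical)
    }


def toy_loss(config, epochs):
    """Synthetic stand-in for validation error after `epochs` epochs (no real training)."""
    capacity = config["num_blocks"] * config["width"]
    # The best learning rate shifts with capacity: a toy architecture/hyperparameter interaction.
    best_log_lr = -2.0 - 0.005 * capacity
    lr_penalty = (math.log10(config["learning_rate"]) - best_log_lr) ** 2
    return 1.0 / capacity + 0.1 * lr_penalty + 1.0 / epochs


def successive_halving(n_configs=27, min_epochs=1, eta=3, seed=0):
    """Evaluate many configurations cheaply, keep the best 1/eta, and multiply the budget by eta."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(n_configs)]
    epochs = min_epochs
    history = []  # (number of configurations evaluated, epochs per configuration)
    while True:
        scored = sorted(configs, key=lambda c: toy_loss(c, epochs))
        history.append((len(scored), epochs))
        if len(scored) < eta:
            return scored[0], history
        configs = scored[: len(scored) // eta]
        epochs *= eta


best, history = successive_halving()
print(best)
print(history)
```

Because each configuration carries both architecture and hyperparameter values, a candidate is never judged with a learning rate that was tuned for a different architecture. Swapping `sample_config` for a sampler fitted to earlier results would move the sketch closer to the Bayesian optimization component, which is not implemented here.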
Abstract
While existing work on neural architecture search (NAS) tunes hyperparameters in a separate post-processing step, we demonstrate that architectural choices and other hyperparameter settings interact in a way that can render this separation suboptimal. Likewise, we demonstrate that the common practice of using very few epochs during the main NAS and much larger numbers of epochs during a post-processing step is inefficient due to little correlation in the relative rankings for these two training regimes. To combat both of these problems, we propose to use a recent combination of Bayesian optimization and Hyperband for efficient joint neural architecture and hyperparameter search.
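One way to quantify the "correlation in the relative rankings" mentioned in the abstract is a Spearman rank correlation between the validation errors of the same configurations under a short and a long training budget. The sketch below is a minimal pure-Python version that assumes no tied values; the error values are made-up numbers for illustration, not results from the paper, and the paper may use a different correlation measure.

```python
def ranks(values):
    """Return the rank position of each value (0 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position
    return result


def spearman(a, b):
    """Spearman rank correlation of two equal-length sequences without ties."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical validation errors of five configurations (illustrative values only).
short_budget = [0.40, 0.35, 0.50, 0.45, 0.30]  # after a few epochs
long_budget = [0.10, 0.22, 0.12, 0.08, 0.20]   # after many epochs

rho = spearman(short_budget, long_budget)
print(rho)
```

A value near 1 would mean short runs rank configurations the same way long runs do; a value near 0 or below means that selecting architectures from short runs says little about which ones win after full training.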
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Machine Learning and Algorithms
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
