Accelerating Neural Architecture Search using Performance Prediction
Bowen Baker, Otkrist Gupta, Ramesh Raskar, Nikhil Naik

TL;DR
This paper introduces simple, fast frequentist regression models that predict a network's final performance from a partially trained configuration, enabling up to 6x faster hyperparameter optimization and architecture search without sacrificing accuracy.
Contribution
The paper presents an early stopping method built on performance prediction models that are more effective than prominent Bayesian counterparts, simpler to implement, and faster to train, and that apply across domains and architectures.
Findings
Performance prediction models outperform Bayesian counterparts in accuracy.
The early stopping method achieves up to a 6x speedup in hyperparameter optimization and architecture search.
The models work in both visual classification and language modeling, handle drastically varying architectures, and can generalize between model classes.
Abstract
Methods for neural network hyperparameter optimization and meta-modeling are computationally expensive due to the need to train a large number of model configurations. In this paper, we show that standard frequentist regression models can predict the final performance of partially trained model configurations using features based on network architectures, hyperparameters, and time-series validation performance data. We empirically show that our performance prediction models are much more effective than prominent Bayesian counterparts, are simpler to implement, and are faster to train. Our models can predict final performance in both visual classification and language modeling domains, are effective for predicting performance of drastically varying model architectures, and can even generalize between model classes. Using these prediction models, we also propose an early stopping method…
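To make the idea in the abstract concrete, here is a minimal sketch of a regression model that predicts final validation accuracy from architecture, hyperparameter, and partial learning-curve features. It is an illustration under stated assumptions, not the authors' implementation: ridge regression stands in for "standard frequentist regression models", and the helper names (`build_features`, `fit_ridge`, `predict`), the feature layout, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def build_features(arch: Sequence[float], hparams: Sequence[float],
                   val_curve: Sequence[float]) -> List[float]:
    """Concatenate a bias term with architecture, hyperparameter,
    and partial validation-curve features."""
    return [1.0, *arch, *hparams, *val_curve]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ridge(X: List[List[float]], y: List[float],
              lam: float = 1e-3) -> List[float]:
    """Ridge regression via the normal equations (X^T X + lam I) w = X^T y.
    For simplicity the bias weight is regularized too."""
    d = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(xtx, xty)


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear prediction: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Synthetic toy data (hypothetical): depth, learning rate, and the first
# three epochs of validation accuracy for six configurations.
configs = [
    (2, 0.10, [0.30, 0.42, 0.50]),
    (4, 0.05, [0.35, 0.50, 0.58]),
    (6, 0.01, [0.40, 0.55, 0.66]),
    (8, 0.05, [0.38, 0.57, 0.70]),
    (10, 0.01, [0.45, 0.62, 0.74]),
    (3, 0.10, [0.25, 0.33, 0.41]),
]
X_train = [build_features([d], [lr], curve) for d, lr, curve in configs]
# Synthetic target for illustration only: last observed accuracy plus a fixed gain.
y_train = [curve[-1] + 0.05 for _, _, curve in configs]

weights = fit_ridge(X_train, y_train)
new_config = build_features([5], [0.05], [0.36, 0.52, 0.61])
print(f"predicted final accuracy: {predict(weights, new_config):.3f}")
```

In a search loop, a prediction like this could be compared against the best configuration seen so far to decide whether a run is worth continuing. The specific stopping rule is not given in the truncated abstract above, so none is sketched here.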
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Data Stream Mining Techniques
Methods: Early Stopping
