Trainless Model Performance Estimation for Neural Architecture Search
Ekaterina Gracheva

TL;DR
This paper proposes a new method for neural architecture search that estimates architecture performance based on the stability of untrained accuracy across initializations, reducing the need for training architectures.
Contribution
It introduces the coefficient of variation of untrained accuracy ($CV_{U}$) as a novel scoring metric for architecture selection, demonstrating its effectiveness across multiple datasets.
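For reference, the coefficient of variation is the standard ratio of standard deviation to mean. Applied to untrained accuracy it can be written as below, where $N$ (the number of random initialisations), $a_i$ (the untrained accuracy for initialisation $i$), $\mu_{U}$, and $\sigma_{U}$ are notation assumed here for illustration. Whether the paper uses the population or the sample standard deviation is not stated in this summary; the population form is shown.

```latex
\[
CV_{U} = \frac{\sigma_{U}}{\mu_{U}},
\qquad
\mu_{U} = \frac{1}{N}\sum_{i=1}^{N} a_i,
\qquad
\sigma_{U} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(a_i - \mu_{U}\right)^{2}}
\]
```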
Findings
Lower $CV_{U}$ correlates with higher trained accuracy.
The method significantly reduces search time compared to traditional training-based approaches.
Architectures with minimal $CV_{U}$ achieve above-baseline performance on the NAS-Bench-201 datasets: CIFAR-10, CIFAR-100, and ImageNet16-120.
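The selection rule behind these findings can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names, the architecture labels, and the accuracy values are all hypothetical toy inputs, and the population standard deviation is an assumed choice.

```python
from statistics import mean, pstdev


def coefficient_of_variation(untrained_accuracies):
    """Return std / mean of accuracies measured at several random initialisations."""
    mu = mean(untrained_accuracies)
    if mu == 0:
        raise ValueError("mean untrained accuracy is zero; CV is undefined")
    # Assumption: population standard deviation (pstdev), not the sample version.
    return pstdev(untrained_accuracies) / mu


def select_architecture(accuracies_by_arch):
    """Score every architecture by its CV and return the one with the lowest score."""
    scores = {
        name: coefficient_of_variation(accs)
        for name, accs in accuracies_by_arch.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: untrained accuracy of three architectures,
# each evaluated at four random initialisations. Not results from the paper.
untrained = {
    "arch_a": [0.10, 0.12, 0.08, 0.11],
    "arch_b": [0.10, 0.10, 0.11, 0.10],
    "arch_c": [0.05, 0.15, 0.09, 0.12],
}

best, scores = select_architecture(untrained)
print(best)
```

In this toy data all three architectures have the same mean untrained accuracy, so the ranking is driven entirely by how stable that accuracy is across initialisations; no training is involved in the scoring step.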
Abstract
Neural architecture search has become an indispensable part of the deep learning field. Modern methods make it possible to find one of the best performing architectures, or to build one from scratch, but they typically make decisions based on trained accuracy. In the present article we explore instead how the architectural component of a neural network affects its predictive power. We focus on the relationship between the trained accuracy of an architecture and its accuracy prior to training, by considering statistics over multiple initialisations. We observe that minimising the coefficient of variation of the untrained accuracy, $CV_{U}$, consistently leads to better performing architectures. We test $CV_{U}$ as a neural architecture search scoring metric using the NAS-Bench-201 database of trained neural architectures. The architectures with the lowest $CV_{U}$ value have on…
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
