NAS-Bench-x11 and the Power of Learning Curves
Shen Yan, Colin White, Yash Savani, Frank Hutter

TL;DR
This paper introduces surrogate NAS benchmarks that provide complete training curves for architectures, enabling advanced multi-fidelity NAS techniques like learning curve extrapolation, which improve search efficiency and performance.
Contribution
The authors develop surrogate benchmarks with full training information using SVD and noise modeling, facilitating learning curve extrapolation in NAS research.
Findings
Learning curve extrapolation improves NAS efficiency.
Surrogate benchmarks enable multi-fidelity NAS methods.
Full training data enhances architecture evaluation accuracy.
Abstract
While early research in neural architecture search (NAS) required extreme computational resources, the recent releases of tabular and surrogate benchmarks have greatly increased the speed and reproducibility of NAS research. However, two of the most popular benchmarks do not provide the full training information for each architecture. As a result, on these benchmarks it is not possible to run many types of multi-fidelity techniques, such as learning curve extrapolation, that require evaluating architectures at arbitrary epochs. In this work, we present a method using singular value decomposition and noise modeling to create surrogate benchmarks, NAS-Bench-111, NAS-Bench-311, and NAS-Bench-NLP11, that output the full training information for each architecture, rather than just the final validation accuracy. We demonstrate the power of using the full training information by introducing a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Advanced Neural Network Applications · Machine Learning in Materials Science
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
