Generalization Guarantees for Neural Architecture Search with Train-Validation Split

Samet Oymak; Mingchen Li; Mahdi Soltanolkotabi

arXiv:2104.14132 · stat.ML · March 4, 2022 · 5 cites

TL;DR

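This paper analyzes the statistical properties of neural architecture search (NAS) posed as a bilevel problem with a train-validation split: model weights are fit on the training data and the architecture is chosen on the validation data. It shows that properties of the validation loss are indicative of the test loss, and it gives generalization bounds for the resulting search.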

Contribution

The paper provides theoretical insight into how NAS with a train-validation split generalizes, gives bounds for gradient-based search, and draws connections to kernel learning and matrix learning methods.

Findings

01

Properties of the validation loss, such as the risk, are indicative of the true test loss.

02

Gradient descent finds the optimal architecture even when the training error is zero.

03

Spectral methods can efficiently solve the outer NAS problem.

Abstract

Neural Architecture Search (NAS) is a popular method for automatically designing optimized architectures for high-performance deep learning. In this approach, it is common to use bilevel optimization, where one optimizes the model weights over the training data (inner problem) and various hyperparameters, such as the configuration of the architecture, over the validation data (outer problem). This paper explores the statistical aspects of such problems with train-validation splits. In practice, the inner problem is often overparameterized and can easily achieve zero loss. Thus, a priori, it seems impossible to distinguish the right hyperparameters based on training loss alone, which motivates a better understanding of the role of the train-validation split. To this end, this work establishes the following results. (1) We show that refined properties of the validation loss such as risk and…
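
The inner/outer structure described in the abstract can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the random-feature model, the three candidate activations standing in for "architectures", and the exhaustive search over candidates (the paper discusses gradient-based search, which this sketch does not implement). The point is only that each overparameterized candidate can fit the training split almost exactly, so the training loss cannot rank them, while the validation loss can.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): 20 training and 200 validation points in 5 dimensions.
d, width, n_train, n_val = 5, 200, 20, 200
X = rng.normal(size=(n_train + n_val, d))
y = np.sin(X @ rng.normal(size=d))        # hypothetical ground-truth target
X_tr, y_tr = X[:n_train], y[:n_train]     # training split -> inner problem
X_val, y_val = X[n_train:], y[n_train:]   # validation split -> outer problem

# Fixed random first layer shared by every candidate; only the activation changes.
W = rng.normal(size=(d, width))

# Candidate "architectures" (illustrative): the searched hyperparameter is the activation.
candidates = {
    "relu": lambda z: np.maximum(z, 0.0),
    "tanh": np.tanh,
    "sin": np.sin,
}

def mse(features, theta, targets):
    """Mean squared error of the linear readout theta on the given features."""
    return float(np.mean((features @ theta - targets) ** 2))

results = {}
for name, act in candidates.items():
    Phi_tr, Phi_val = act(X_tr @ W), act(X_val @ W)
    # Inner problem: minimum-norm least squares on the training split.
    # With width > n_train the model is overparameterized and can interpolate.
    theta, *_ = np.linalg.lstsq(Phi_tr, y_tr, rcond=None)
    results[name] = (mse(Phi_tr, theta, y_tr), mse(Phi_val, theta, y_val))

# Outer problem: pick the candidate with the smallest validation loss.
best = min(results, key=lambda name: results[name][1])

for name, (train_loss, val_loss) in results.items():
    print(f"{name:>4}: train={train_loss:.2e}  val={val_loss:.4f}")
print("selected:", best)
```

In this sketch the training losses are all numerically close to zero, so only the second entry of each `results` tuple carries information about which candidate to prefer.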

Videos

Generalization Guarantees for Neural Architecture Search with Train-Validation Split · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Advanced Neural Network Applications