Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu, Fanghui Liu, Grigorios G Chrysos, Volkan Cevher

TL;DR
This paper gives a theoretical analysis of Neural Architecture Search (NAS): it derives lower and upper bounds on the minimum eigenvalue of the Neural Tangent Kernel (NTK) and uses them to guide architecture selection and enable train-free NAS.
Contribution
It introduces a unifying theoretical framework for NAS that covers skip connection search and activation function search, and uses bounds on the minimum NTK eigenvalue to establish generalization error bounds under stochastic gradient descent training.
Findings
Bounds on the minimum eigenvalue of the NTK yield generalization error bounds for NAS
Theoretical results guide architecture selection without training
Experimental validation supports the train-free NAS approach
Abstract
Neural Architecture Search (NAS) has fostered the automatic discovery of state-of-the-art neural architectures. Despite the progress achieved with NAS, so far there is little attention to theoretical guarantees on NAS. In this work, we study the generalization properties of NAS under a unifying framework enabling (deep) layer skip connection search and activation function search. To this end, we derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime using a certain search space including mixed activation functions, fully connected, and residual neural networks. We use the minimum eigenvalue to establish generalization error bounds of NAS in the stochastic gradient descent training. Importantly, we theoretically and experimentally show how the derived results can guide NAS to select the top-performing…
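The role of the minimum NTK eigenvalue as a train-free signal can be illustrated with a small numerical sketch. The code below is not the paper's algorithm or search space: it assumes a hypothetical two-layer fully connected network f(x) = a^T σ(Wx) / sqrt(m) with a mixed activation σ = α·ReLU + (1 − α)·tanh, builds the empirical NTK Gram matrix at random initialization, and returns its smallest eigenvalue. The function name `empirical_ntk_min_eig`, the mixing weight `alpha`, the width, and the rule "prefer the larger minimum eigenvalue" are all assumptions made for illustration; skip connections are not modeled.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_grad(z):
    # Derivative of ReLU (taken as 0 at z == 0).
    return (z > 0).astype(z.dtype)

def tanh_grad(z):
    return 1.0 - np.tanh(z) ** 2

def empirical_ntk_min_eig(X, alpha, width=512, seed=0):
    """Smallest eigenvalue of the empirical NTK Gram matrix of the
    illustrative network f(x) = a^T sigma(W x) / sqrt(width), where
    sigma = alpha * relu + (1 - alpha) * tanh (a hypothetical mixed activation)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.standard_normal((width, d))   # first-layer weights
    a = rng.standard_normal(width)        # output weights

    Z = X @ W.T                                            # (n, width) pre-activations
    act = alpha * relu(Z) + (1.0 - alpha) * np.tanh(Z)     # sigma(Z)
    dact = alpha * relu_grad(Z) + (1.0 - alpha) * tanh_grad(Z)  # sigma'(Z)

    # Gradient w.r.t. a:   df/da_k = sigma(z_k) / sqrt(width)
    J_a = act / np.sqrt(width)
    K_a = J_a @ J_a.T

    # Gradient w.r.t. W_k: df/dW_k = a_k * sigma'(z_k) * x / sqrt(width),
    # so K_W[i, j] = <x_i, x_j> * sum_k a_k^2 sigma'(z_ik) sigma'(z_jk) / width.
    B = dact * a
    K_W = (B @ B.T) * (X @ X.T) / width

    K = K_a + K_W                          # empirical NTK Gram matrix (n x n)
    return float(np.linalg.eigvalsh(K)[0]) # eigvalsh returns ascending eigenvalues

# Score a few candidate activation mixes on random unit-norm inputs, without training.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 10))
X /= np.linalg.norm(X, axis=1, keepdims=True)
scores = {alpha: empirical_ntk_min_eig(X, alpha) for alpha in (0.0, 0.5, 1.0)}
best_alpha = max(scores, key=scores.get)   # assumed rule: larger minimum eigenvalue is preferred
```

Because the score depends on a single random initialization, averaging over several seeds would give a steadier ranking; the paper's own bounds and selection procedure should be consulted for the actual criterion.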
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Domain Adaptation and Few-Shot Learning
Methods: Neural Tangent Kernel
