GradAlign for Training-free Model Performance Inference
Yuxuan Li, Yunhui Guo

TL;DR
GradAlign is a training-free method that predicts neural network performance by measuring gradient conflicts at initialization, outperforming existing metrics like linear region count in architecture selection.
Contribution
Introduces GradAlign, a training-free performance inference technique based on gradient conflict analysis, which ranks candidate architectures without training them.
Findings
GradAlign outperforms existing training-free NAS methods on standard benchmarks.
Gradient conflicts at initialization correlate with final model performance.
Linear region count is not a reliable metric for architecture selection.
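To make the idea of "gradient conflicts at initialization" concrete, here is a minimal sketch in Python with NumPy. It is not the paper's GradAlign score, whose exact definition is not given in this summary. It assumes a toy one-hidden-layer ReLU network with squared loss and uses a simple proxy: the fraction of sample pairs whose per-sample gradients have negative cosine similarity. The function names (per_sample_gradients, conflict_fraction, score_at_init) and the initialization scales are hypothetical choices made for illustration.

```python
import numpy as np

def per_sample_gradients(W1, w2, X, y):
    """Per-sample gradients of 0.5 * (f(x) - y)^2 for f(x) = w2 . relu(W1 x).

    Toy model chosen for illustration only (an assumption, not from the paper).
    Returns an array of shape (n_samples, n_params).
    """
    pre = X @ W1.T                      # (n, h) pre-activations
    hid = np.maximum(pre, 0.0)          # (n, h) ReLU activations
    err = hid @ w2 - y                  # (n,)  residual per sample
    g_w2 = err[:, None] * hid           # (n, h) gradient w.r.t. w2
    delta = err[:, None] * (pre > 0) * w2[None, :]   # (n, h) backprop signal
    g_W1 = delta[:, :, None] * X[:, None, :]         # (n, h, d) gradient w.r.t. W1
    return np.concatenate([g_W1.reshape(len(X), -1), g_w2], axis=1)

def conflict_fraction(G, eps=1e-12):
    """Fraction of sample pairs whose gradients point in opposing directions.

    A pair counts as conflicting when its cosine similarity is negative.
    This proxy is an assumption, not the paper's metric.
    """
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    U = G / np.maximum(norms, eps)      # unit-normalize each gradient row
    cos = U @ U.T                       # pairwise cosine similarities
    iu = np.triu_indices(len(G), k=1)   # each unordered pair once
    return float(np.mean(cos[iu] < 0.0))

def score_at_init(hidden, X, y, seed=0):
    """Conflict fraction for a randomly initialized network of given width."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W1 = rng.normal(0.0, np.sqrt(2.0 / d), size=(hidden, d))
    w2 = rng.normal(0.0, np.sqrt(1.0 / hidden), size=hidden)
    return conflict_fraction(per_sample_gradients(W1, w2, X, y))
```

Under these assumptions, one could compute score_at_init for several candidate widths on the same small batch and compare the values; whether a lower conflict fraction tracks trained accuracy for a given search space is the kind of question the paper studies, and this sketch does not establish it.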
Abstract
Architecture plays an important role in deciding the performance of deep neural networks. However, the search for the optimal architecture is often hindered by the vast search space, making it a time-intensive process. Recently, a novel approach known as training-free neural architecture search (NAS) has emerged, aiming to discover the ideal architecture without necessitating extensive training. Training-free NAS leverages various indicators for architecture selection, including metrics such as the count of linear regions, the density of per-sample losses, and the stability of the finite-width Neural Tangent Kernel (NTK) matrix. Despite the competitive empirical performance of current training-free NAS techniques, they suffer from certain limitations, including inconsistent performance and a lack of deep understanding. In this paper, we introduce GradAlign, a simple yet effective method…
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare
