How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS
Kaicheng Yu, Rene Ranftl, Mathieu Salzmann

TL;DR
This paper systematically evaluates training heuristics and hyperparameters in weight-sharing neural architecture search, revealing their impact on super-net performance correlation and establishing a reproducible baseline for future research.
Contribution
It provides a comprehensive analysis of heuristics and hyperparameters in weight-sharing NAS, highlighting their effects and offering a solid baseline for future studies.
Findings
Some commonly used super-net training heuristics reduce the correlation between super-net and stand-alone performance.
Certain hyperparameters and architectural choices strongly influence the results of weight-sharing NAS.
The study offers a reproducible baseline for future research in weight-sharing NAS.
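The correlation between super-net and stand-alone performance named in the first finding can be quantified with a rank correlation. The sketch below uses Kendall's tau as one plausible choice; this summary does not state which metric the paper reports, so the metric is an assumption here. The function name `kendall_tau` and all accuracy values are hypothetical, chosen for illustration only, and are not results from the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # Positive product: both measurements order the pair the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical accuracies for five architectures (illustrative only).
# supernet_acc: accuracy with weights inherited from the shared super-net.
# standalone_acc: accuracy after training each architecture from scratch.
supernet_acc = [0.62, 0.58, 0.71, 0.66, 0.55]
standalone_acc = [0.931, 0.925, 0.942, 0.929, 0.920]

tau = kendall_tau(supernet_acc, standalone_acc)
print(f"Kendall tau = {tau:.2f}")
```

A value near 1 means the super-net ranks architectures in almost the same order as stand-alone training does, which is the property a weight-sharing search relies on; a value near 0 means the super-net ranking carries little information about stand-alone quality.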
Abstract
Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware. Existing methods in this space rely on a diverse set of heuristics to design and train the shared-weight backbone network, a.k.a. the super-net. Since heuristics and hyperparameters substantially vary across different methods, a fair comparison between them can only be achieved by systematically analyzing the influence of these factors. In this paper, we therefore provide a systematic evaluation of the heuristics and hyperparameters that are frequently employed by weight-sharing NAS algorithms. Our analysis uncovers that some commonly-used heuristics for super-net training negatively impact the correlation between super-net and stand-alone performance, and evidences the strong influence of certain hyperparameters and architectural choices. Our code and experiments set a strong and…
Taxonomy
Topics: Vehicle Routing Optimization Methods · Advanced Manufacturing and Logistics Optimization · Smart Parking Systems Research
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
