SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss Landscape
Hua Zheng, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Wen-Yen Chen, Wei Wen

TL;DR
SiGeo introduces a novel sub-one-shot NAS framework that leverages limited training data and information theory to efficiently evaluate neural architectures, outperforming existing proxies with reduced computational costs.
Contribution
The paper proposes a new sub-one-shot NAS paradigm and a theoretical proxy, SiGeo, that effectively bridges zero-shot and one-shot NAS, reducing training requirements and computational costs.
Findings
SiGeo outperforms state-of-the-art NAS proxies on benchmarks.
Warm-up with SiGeo achieves comparable performance to traditional one-shot NAS.
Sub-one-shot NAS reduces computational costs by approximately 60%.
Abstract
Neural Architecture Search (NAS) has become a widely used tool for automating neural network design. While one-shot NAS methods have successfully reduced computational requirements, they often require extensive training. On the other hand, zero-shot NAS utilizes training-free proxies to evaluate a candidate architecture's test performance but has two limitations: (1) inability to use the information gained as a network improves with training and (2) unreliable performance, particularly in complex domains like RecSys, due to the multi-modal data inputs and complex architecture configurations. To synthesize the benefits of both methods, we introduce a "sub-one-shot" paradigm that serves as a bridge between zero-shot and one-shot NAS. In sub-one-shot NAS, the supernet is trained using only a small subset of the training data, a phase we refer to as "warm-up." Within this framework, we…
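The warm-up step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `warmup_subset` and `rank_candidates`, the 10% warm-up fraction, and the toy scores are all hypothetical, and the SiGeo proxy itself is not reproduced here. The sketch only shows the general shape of the workflow: pick a small subset of the training data for warm-up, then rank candidate architectures by some proxy score.

```python
import random
from typing import Callable, List, Sequence


def warmup_subset(num_samples: int, fraction: float, seed: int = 0) -> List[int]:
    """Pick indices of a small random subset of the training set for warm-up."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    k = max(1, int(num_samples * fraction))
    return sorted(rng.sample(range(num_samples), k))


def rank_candidates(candidates: Sequence[str],
                    proxy: Callable[[str], float]) -> List[str]:
    """Sort candidate architectures from highest to lowest proxy score."""
    return sorted(candidates, key=proxy, reverse=True)


# Hypothetical usage: warm up on 10% of 1,000 samples (assumed fraction),
# then rank three placeholder architectures by a stand-in score.
indices = warmup_subset(num_samples=1000, fraction=0.1)
toy_scores = {"arch_a": 0.2, "arch_b": 0.9, "arch_c": 0.5}  # placeholders, not results
ranking = rank_candidates(list(toy_scores), lambda name: toy_scores[name])
```

In a real pipeline the supernet would be trained only on the samples selected by `indices`, and the stand-in lambda would be replaced by a proxy evaluated on the warmed-up supernet.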
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Machine Learning and Data Classification
