A Hardware-Aware System for Accelerating Deep Neural Network Optimization
Anthony Sarah, Daniel Cummings, Sharath Nittur Sridhar, Sairam Sundaresan, Maciej Szankin, Tristan Webb, J. Pablo Munoz

TL;DR
This paper presents a hardware-aware system that efficiently finds optimized sub-networks from a pre-trained super-network, significantly reducing search time while maintaining performance across various hardware configurations.
Contribution
It introduces a comprehensive, hardware-aware sub-network search system that combines novel algorithms and predictors, enabling faster and more flexible neural network optimization without needing super-network refinement.
Findings
Achieved an 8x faster search than the Bayesian optimization approach WeakNAS.
Successfully applied to ResNet50, MobileNetV3, and Transformer.
Maintained diversity in Pareto front during optimization.
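To make the Pareto-front finding concrete, here is a minimal sketch of extracting the non-dominated set from candidate sub-networks scored on two objectives (latency to minimize, accuracy to maximize). This is not the paper's search algorithm; the candidate names and the latency and accuracy values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (name, latency_ms, top1_accuracy); all values below are made up for illustration.
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated if another one is at least as good on both
    objectives (lower-or-equal latency, higher-or-equal accuracy) and
    strictly better on at least one of them.
    """
    front = []
    for name, lat, acc in candidates:
        dominated = any(
            (o_lat <= lat and o_acc >= acc) and (o_lat < lat or o_acc > acc)
            for _, o_lat, o_acc in candidates
        )
        if not dominated:
            front.append((name, lat, acc))
    # Sort by latency so the front reads from fastest to most accurate.
    return sorted(front, key=lambda c: c[1])


# Hypothetical sub-network measurements on one hardware target.
candidates = [
    ("subnet_a", 10.0, 0.72),
    ("subnet_b", 15.0, 0.76),
    ("subnet_c", 15.0, 0.74),  # dominated by subnet_b
    ("subnet_d", 25.0, 0.78),
    ("subnet_e", 30.0, 0.77),  # dominated by subnet_d
]

front = pareto_front(candidates)
for name, lat, acc in front:
    print(f"{name}: {lat:.1f} ms, {acc:.2%} top-1")
```

A diverse front in this sense is one whose surviving points are spread across the latency range rather than clustered at one end, which gives more usable trade-offs per hardware target.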
Abstract
Recent advances in Neural Architecture Search (NAS) which extract specialized hardware-aware configurations (a.k.a. "sub-networks") from a hardware-agnostic "super-network" have become increasingly popular. While considerable effort has been employed towards improving the first stage, namely, the training of the super-network, the search for derivative high-performing sub-networks is still largely under-explored. For example, some recent network morphism techniques allow a super-network to be trained once and then have hardware-specific networks extracted from it as needed. These methods decouple the super-network training from the sub-network search and thus decrease the computational burden of specializing to different hardware platforms. We propose a comprehensive system that automatically and efficiently finds sub-networks from a pre-trained super-network that are optimized to…
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · ReLU6 · Average Pooling · Hard Swish · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution
