A Hardware-Aware System for Accelerating Deep Neural Network Optimization

Anthony Sarah; Daniel Cummings; Sharath Nittur Sridhar; Sairam Sundaresan; Maciej Szankin; Tristan Webb; J. Pablo Munoz

arXiv:2202.12954 · cs.AI · March 1, 2022 · 1 cites

TL;DR

This paper presents a hardware-aware system that efficiently finds optimized sub-networks from a pre-trained super-network, significantly reducing search time while maintaining performance across various hardware configurations.

Contribution

It introduces a comprehensive, hardware-aware sub-network search system that combines novel algorithms and predictors, enabling faster and more flexible neural network optimization without needing super-network refinement.

Findings

1. Achieved an 8x faster search than WeakNAS, a Bayesian optimization approach.
2. Successfully applied to ResNet50, MobileNetV3, and Transformer.
3. Maintained Pareto front diversity during optimization.

Abstract

Recent advances in Neural Architecture Search (NAS) which extract specialized hardware-aware configurations (a.k.a. "sub-networks") from a hardware-agnostic "super-network" have become increasingly popular. While considerable effort has been employed towards improving the first stage, namely, the training of the super-network, the search for derivative high-performing sub-networks is still largely under-explored. For example, some recent network morphism techniques allow a super-network to be trained once and then have hardware-specific networks extracted from it as needed. These methods decouple the super-network training from the sub-network search and thus decrease the computational burden of specializing to different hardware platforms. We propose a comprehensive system that automatically and efficiently finds sub-networks from a pre-trained super-network that are optimized to…
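The decoupling described in the abstract (train the super-network once, then search over sub-network configurations afterwards) can be illustrated with a small Python sketch. Everything here is hypothetical: the search space, the `toy_objectives` formulas, and all function names are invented for illustration, and plain random sampling stands in for the search algorithms and predictors the paper actually uses. It only shows how candidate configurations might be scored on two objectives and filtered down to a Pareto front.

```python
import random

# Hypothetical elastic search space. The dimensions and values are
# illustrative assumptions, not taken from the paper.
SEARCH_SPACE = {
    "depth": [2, 3, 4],
    "width": [0.65, 0.8, 1.0],
    "kernel": [3, 5, 7],
}


def sample_subnetwork(rng):
    """Pick one value per dimension to form a sub-network configuration."""
    return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}


def toy_objectives(config):
    """Stand-in for measuring or predicting (latency, accuracy).

    Both formulas are made up so the example runs without a trained model.
    """
    latency = config["depth"] * config["width"] * config["kernel"] ** 2
    accuracy = 0.6 + 0.05 * config["depth"] + 0.1 * config["width"] + 0.01 * config["kernel"]
    return latency, accuracy


def dominates(a, b):
    """True if a is no worse than b on both objectives and strictly better on one."""
    lat_a, acc_a = a
    lat_b, acc_b = b
    return lat_a <= lat_b and acc_a >= acc_b and (lat_a < lat_b or acc_a > acc_b)


def pareto_front(scored):
    """Keep only the (config, objectives) pairs that no other pair dominates."""
    return [
        (cfg, obj)
        for cfg, obj in scored
        if not any(dominates(other, obj) for _, other in scored)
    ]


def random_search(num_samples=20, seed=0):
    """Sample configurations, drop duplicates, and return the Pareto front."""
    rng = random.Random(seed)
    unique = {}
    for _ in range(num_samples):
        cfg = sample_subnetwork(rng)
        unique[tuple(sorted(cfg.items()))] = cfg
    scored = [(cfg, toy_objectives(cfg)) for cfg in unique.values()]
    return pareto_front(scored)


if __name__ == "__main__":
    for cfg, (latency, accuracy) in random_search():
        print(cfg, round(latency, 2), round(accuracy, 3))
```

Because the super-network is never touched inside `random_search`, the same loop could in principle be rerun with a different objective function for each target hardware platform, which is the computational saving the abstract refers to.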

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · ReLU6 · Average Pooling · Hard Swish · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution