Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search
Keith G. Mills, Fred X. Han, Jialin Zhang, Seyed Saeed Changiz Rezaei, Fabian Chudak, Wei Lu, Shuo Lian, Shangling Jui, Di Niu

TL;DR
This paper analyzes neural blocks and design spaces in neural architecture search to understand hardware compatibility, introducing a methodology to optimize search spaces for better accuracy and latency on diverse devices.
Contribution
It introduces a profiling methodology for neural blocks that enables hardware-aware search space reduction, improving the accuracy-latency trade-off of the networks found by search.
Findings
Hardware-specific search spaces yield better accuracy-latency trade-offs.
Profiling neural blocks predicts inference latency across devices.
Hardware-aware search improves ImageNet top-1 scores.
Abstract
Neural architecture search automates neural network design and has achieved state-of-the-art results in many deep learning applications. While recent literature has focused on designing networks to maximize accuracy, little work has been conducted to understand the compatibility of architecture design spaces to varying hardware. In this paper, we analyze the neural blocks used to build Once-for-All (MobileNetV3), ProxylessNAS and ResNet families, in order to understand their predictive power and inference latency on various devices, including Huawei Kirin 9000 NPU, RTX 2080 Ti, AMD Threadripper 2990WX, and Samsung Note10. We introduce a methodology to quantify the friendliness of neural blocks to hardware and the impact of their placement in a macro network on overall network performance via only end-to-end measurements. Based on extensive profiling results, we derive design insights…
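The idea of attributing cost to individual blocks using only end-to-end measurements can be illustrated with a small sketch. The example below assumes a purely additive latency model (network latency is roughly the sum of per-block costs plus a fixed overhead) and fits it by least squares on synthetic data. The function name fit_block_costs, the additive model, and all numbers are assumptions made for illustration; this is not presented as the paper's actual procedure.

```python
import numpy as np


def fit_block_costs(block_counts, latencies_ms):
    """Estimate per-block latency from end-to-end timings (hypothetical helper).

    block_counts: (n_networks, n_block_types) array; entry [i, j] is how many
                  times block type j appears in network i.
    latencies_ms: (n_networks,) array of measured end-to-end latencies.
    Returns (per_block_costs, fixed_overhead) under an assumed additive model.
    """
    counts = np.asarray(block_counts, dtype=float)
    y = np.asarray(latencies_ms, dtype=float)
    # Append a constant column so the fit can absorb fixed per-network overhead.
    design = np.hstack([counts, np.ones((counts.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]


# Synthetic, noise-free data standing in for real device measurements.
true_costs = np.array([1.5, 3.0, 0.5])   # assumed ms per block type
true_overhead = 2.0                       # assumed fixed ms per network
rng = np.random.default_rng(0)
counts = rng.integers(0, 6, size=(20, 3))
latencies = counts @ true_costs + true_overhead

est_costs, est_overhead = fit_block_costs(counts, latencies)
print(est_costs, est_overhead)
```

On a real device the additive assumption may not hold, since the abstract notes that a block's placement in the macro network also affects overall performance; a position-aware feature encoding would be one way to extend the sketch.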
Taxonomy
Methods: Cutout · 1x1 Convolution · DropPath · Batch Normalization · Residual Connection · Average Pooling · Max Pooling · Residual Block · Bottleneck Residual Block
