Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search

Keith G. Mills; Fred X. Han; Jialin Zhang; Seyed Saeed Changiz Rezaei; Fabian Chudak; Wei Lu; Shuo Lian; Shangling Jui; Di Niu

arXiv:2109.12426 · cs.LG · September 28, 2021

TL;DR

This paper profiles the neural blocks and design spaces behind the Once-for-All (MobileNetV3), ProxylessNAS, and ResNet families to understand how well they suit different hardware, and introduces a methodology for tailoring search spaces to a target device for better accuracy-latency trade-offs.

Contribution

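It introduces a profiling methodology that quantifies how hardware-friendly each neural block is, and how its placement in a macro network affects overall network performance, using only end-to-end measurements. This enables hardware-aware reduction of the search space.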

Findings

01

Hardware-specific search spaces yield better accuracy-latency trade-offs.

02

Profiling neural blocks predicts inference latency across devices.

03

Hardware-aware search improves ImageNet top-1 scores.

Abstract

Neural architecture search automates neural network design and has achieved state-of-the-art results in many deep learning applications. While recent literature has focused on designing networks to maximize accuracy, little work has been conducted to understand the compatibility of architecture design spaces to varying hardware. In this paper, we analyze the neural blocks used to build Once-for-All (MobileNetV3), ProxylessNAS and ResNet families, in order to understand their predictive power and inference latency on various devices, including Huawei Kirin 9000 NPU, RTX 2080 Ti, AMD Threadripper 2990WX, and Samsung Note10. We introduce a methodology to quantify the friendliness of neural blocks to hardware and the impact of their placement in a macro network on overall network performance via only end-to-end measurements. Based on extensive profiling results, we derive design insights…
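The idea of attributing end-to-end measurements to individual blocks and positions can be illustrated with a minimal sketch. This is not the authors' method: it assumes a simple marginal-mean estimate, where the effect of a block at a position is the mean latency of all sampled networks that use it there, minus the overall mean. The function name `block_latency_effects`, the block names `block_a` and `block_b`, and all latency values are hypothetical and synthetic.

```python
from collections import defaultdict
from itertools import product
from statistics import mean


def block_latency_effects(archs, latencies):
    """Estimate a marginal latency effect per (position, block) pair.

    archs:     list of tuples; archs[i][p] is the block used at position p.
    latencies: end-to-end latency of each architecture (same order as archs).
    Returns a dict mapping (position, block) to the mean latency of the
    architectures using that block at that position, minus the overall mean.
    """
    overall = mean(latencies)
    groups = defaultdict(list)
    for arch, latency in zip(archs, latencies):
        for position, block in enumerate(arch):
            groups[(position, block)].append(latency)
    return {key: mean(values) - overall for key, values in groups.items()}


# Synthetic, hypothetical per-position costs in milliseconds (assumption:
# latency is additive over positions, which real hardware need not obey).
hidden_cost = {
    (0, "block_a"): 1.0, (0, "block_b"): 3.0,
    (1, "block_a"): 2.0, (1, "block_b"): 2.5,
}

# Enumerate every 2-position architecture and "measure" it end to end.
archs = list(product(["block_a", "block_b"], repeat=2))
latencies = [sum(hidden_cost[(p, b)] for p, b in enumerate(a)) for a in archs]

effects = block_latency_effects(archs, latencies)
for key in sorted(effects):
    print(key, round(effects[key], 3))
```

In this toy setting, the same block shows a larger effect at position 0 than at position 1, which is the kind of placement sensitivity the abstract refers to. With real measurements, the estimate would also absorb noise and interactions between blocks, so it should be read as a rough diagnostic only.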

Code & Models

Repositories

ascend-research/blockprofile
PyTorch · Official

Datasets

kgmills/blockprofile_cikm2021
dataset · 5 downloads

Taxonomy

Methods: Cutout · 1x1 Convolution · DropPath · Batch Normalization · Residual Connection · Average Pooling · Max Pooling · Residual Block · Bottleneck Residual Block