One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Bingqian Lu; Jianyi Yang; Weiwen Jiang; Yiyu Shi; Shaolei Ren

arXiv:2111.01203·cs.LG·November 4, 2021

TL;DR

This paper proposes a scalable hardware-aware neural architecture search (NAS) method that needs only one proxy device: by exploiting latency monotonicity, architectures searched for the proxy device can be reused on diverse target devices, which removes the need to build a latency predictor for each device.

Contribution

The authors leverage latency monotonicity to reuse architectures across devices and propose an adaptation technique that improves monotonicity when it is weak.

Findings

01 Searching on a single proxy device yields architectures nearly as good as those found by per-device NAS.

02 The approach significantly reduces the cost of latency prediction across multiple devices.

03 Experiments on various platforms and search spaces validate the approach.

Abstract

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device has been commonly used in state of the art, this is a very time-consuming process, lacking scalability in the presence of extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity -- the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices, without…
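
To make the notion of latency monotonicity concrete, the sketch below scores how well architecture latency rankings agree between two devices. It assumes Spearman's rank correlation as the agreement measure, which is a standard choice for comparing rankings; the function names and the latency values are hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of equal values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical latencies (ms) of the same five architectures on two devices.
proxy_latency_ms = [12.0, 18.5, 25.1, 31.0, 40.2]
target_latency_ms = [30.0, 41.0, 39.0, 66.0, 80.0]

rho = spearman(proxy_latency_ms, target_latency_ms)
print(f"rank correlation between proxy and target: {rho:.2f}")
```

A value close to 1 means the two devices rank the architectures almost identically, which is the situation in which reusing architectures searched on the proxy device is plausible. In this made-up example only one adjacent pair of architectures swaps order, so the score is 0.90.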

Code & Models

Repositories

Ren-Research/OneProxy (official)
