Searching for Efficient Neural Architectures for On-Device ML on Edge TPUs

Berkin Akin; Suyog Gupta; Yun Long; Anton Spiridonov; Zhuo Wang; Marie White; Hao Xu; Ping Zhou; Yanqi Zhou

arXiv:2204.14007·cs.DC·May 2, 2022

TL;DR

This paper introduces a scalable NAS framework with flexible search spaces tailored for on-device ML accelerators, achieving improved performance-quality trade-offs on Google Tensor SoC across multiple tasks.

Contribution

It presents a decoupled NAS infrastructure and novel group convolution IBN search spaces optimized for diverse on-device ML tasks and platforms.
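The decoupling described above can be illustrated with a minimal sketch in which the search space, the cost model, and the search algorithm are three independent callables. Every name here (`sample_arch`, `toy_cost`, `toy_quality`, `random_search`) is hypothetical, the scoring functions are toy stand-ins, and random search is used only for brevity; none of this is the paper's actual infrastructure or algorithm.

```python
import random
from typing import Callable, Dict, Optional

Arch = Dict[str, int]


def sample_arch(rng: random.Random) -> Arch:
    # Hypothetical search space: a few per-block IBN choices.
    return {
        "expansion": rng.choice([2, 4, 8]),
        "groups": rng.choice([1, 2, 4, 8]),
        "kernel": rng.choice([3, 5]),
    }


def toy_cost(arch: Arch) -> float:
    # Stand-in cost model (relative MACs of a grouped k x k convolution).
    # A real cost model could be a simulator or an on-device measurement.
    return arch["expansion"] * arch["kernel"] ** 2 / arch["groups"]


def toy_quality(arch: Arch) -> float:
    # Placeholder for trained-model quality; purely illustrative.
    return arch["expansion"] * arch["kernel"] / (1 + 0.1 * arch["groups"])


def random_search(
    sample: Callable[[random.Random], Arch],
    cost: Callable[[Arch], float],
    quality: Callable[[Arch], float],
    budget: float,
    trials: int,
    seed: int = 0,
) -> Optional[Arch]:
    # The algorithm sees only the three callables, so any one of them
    # can be replaced without touching the other two.
    rng = random.Random(seed)
    best, best_quality = None, float("-inf")
    for _ in range(trials):
        arch = sample(rng)
        if cost(arch) > budget:
            continue  # skip candidates over the cost budget
        q = quality(arch)
        if q > best_quality:
            best, best_quality = arch, q
    return best


best = random_search(sample_arch, toy_cost, toy_quality, budget=20.0, trials=200)
```

Retargeting a different platform in this sketch means passing a different `cost` callable; the search space and the search loop stay unchanged.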

Findings

01. Achieved better quality-performance trade-offs on Google Tensor SoC

02. Demonstrated improvements across vision and NLP tasks

03. Developed scalable NAS approach for multiple tasks and platforms

Abstract

On-device ML accelerators are becoming a standard in modern mobile system-on-chips (SoC). Neural architecture search (NAS) comes to the rescue for efficiently utilizing the high compute throughput offered by these accelerators. However, existing NAS frameworks have several practical limitations in scaling to multiple tasks and different target platforms. In this work, we provide a two-pronged approach to this challenge: (i) a NAS-enabling infrastructure that decouples model cost evaluation, search space design, and the NAS algorithm to rapidly target various on-device ML tasks, and (ii) search spaces crafted from group convolution based inverted bottleneck (IBN) variants that provide flexible quality/performance trade-offs on ML accelerators, complementing the existing full and depthwise convolution based IBNs. Using this approach we target a state-of-the-art mobile platform, Google…
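To make the trade-off between the IBN variants concrete, the sketch below counts multiply-accumulates (MACs) for one block at stride 1. It assumes, as an illustration rather than a statement of the paper's exact definitions, that a depthwise IBN is a 1x1 expansion, a k x k depthwise convolution, and a 1x1 projection, and that the full and group convolution variants replace the first two layers with a single k x k convolution (with `groups` set to 1 for the full convolution case). The function names are hypothetical.

```python
def conv_macs(h: int, w: int, k: int, c_in: int, c_out: int, groups: int = 1) -> int:
    # MACs of a k x k convolution on an h x w output map (stride 1, same padding).
    # Each output channel reads only c_in / groups input channels.
    assert c_in % groups == 0 and c_out % groups == 0
    return h * w * k * k * (c_in // groups) * c_out


def depthwise_ibn_macs(h: int, w: int, k: int, c: int, e: int) -> int:
    # 1x1 expand (c -> c*e), k x k depthwise, 1x1 project (c*e -> c).
    m = c * e
    return (
        conv_macs(h, w, 1, c, m)
        + conv_macs(h, w, k, m, m, groups=m)
        + conv_macs(h, w, 1, m, c)
    )


def fused_ibn_macs(h: int, w: int, k: int, c: int, e: int, groups: int = 1) -> int:
    # k x k full (groups=1) or group convolution (c -> c*e), then 1x1 project.
    m = c * e
    return conv_macs(h, w, k, c, m, groups) + conv_macs(h, w, 1, m, c)


# Example: one spatial position, 3x3 kernel, 8 channels, expansion factor 4.
dw = depthwise_ibn_macs(1, 1, 3, 8, 4)
full = fused_ibn_macs(1, 1, 3, 8, 4, groups=1)
grouped = fused_ibn_macs(1, 1, 3, 8, 4, groups=4)
```

In this example the group convolution variant sits between the depthwise and full convolution variants in MAC count, and `groups` tunes where it lands. MAC count is only a proxy: measured latency on an accelerator also depends on how well each layer type utilizes the hardware.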


Taxonomy

Topics: Advanced Neural Network Applications · Ferroelectric and Negative Capacitance Devices · Stochastic Gradient Optimization Techniques

Methods: Convolution · Depthwise Convolution