Neural Network Inference on Mobile SoCs

Siqi Wang; Anuj Pathania; Tulika Mitra

arXiv:1908.11450·cs.LG·February 3, 2021

TL;DR

This paper evaluates the inference capabilities and power-performance trade-offs of the ML-capable components in mobile SoCs (CPU, GPU, and accelerators) and shows that engaging all components in parallel can improve inference performance by up to 2x.

Contribution

The paper gives a quantitative evaluation of the inference capabilities of the different ML-capable components on mobile SoCs, explains their power-performance behavior, and explores the performance limit reached when all components run inference concurrently.

Findings

01. Up to 2x inference speedup with all components engaged

02. Different components exhibit distinct power-performance profiles

03. Parallel inference leverages heterogeneous SoC resources effectively

Abstract

The ever-increasing demand from mobile Machine Learning (ML) applications calls for ever more powerful on-chip computing resources. Mobile devices are empowered with heterogeneous multi-processor Systems-on-Chips (SoCs) to process ML workloads such as Convolutional Neural Network (CNN) inference. Mobile SoCs house several different types of ML-capable components on-die, such as CPU, GPU, and accelerators. These different components are capable of independently performing inference but with very different power-performance characteristics. In this article, we provide a quantitative evaluation of the inference capabilities of the different components on mobile SoCs. We also present insights behind their respective power-performance behavior. Finally, we explore the performance limit of the mobile SoCs by synergistically engaging all the components concurrently. We observe that a mobile SoC…
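
To make the idea of engaging all components concurrently more concrete, here is a minimal sketch of the arithmetic behind an idealized speedup. It is not the authors' method. It assumes that each component processes independent images at a fixed throughput, that the baseline is the fastest single component, and that there is no contention for memory bandwidth or thermal budget. The function names and the throughput numbers are hypothetical and are not measurements from the paper.

```python
def split_workload(throughputs, num_images):
    """Assign images to components in proportion to their throughput."""
    total = sum(throughputs.values())
    shares = {name: int(num_images * t / total) for name, t in throughputs.items()}
    # Images lost to rounding down go to the fastest component.
    fastest = max(throughputs, key=throughputs.get)
    shares[fastest] += num_images - sum(shares.values())
    return shares


def ideal_speedup(throughputs):
    """Idealized speedup over the fastest single component.

    Assumes throughputs simply add up when components run concurrently,
    which real SoCs only approximate.
    """
    return sum(throughputs.values()) / max(throughputs.values())


# Hypothetical throughputs in images per second (illustrative only).
example = {"cpu": 4.0, "gpu": 6.0, "accelerator": 10.0}

if __name__ == "__main__":
    print(split_workload(example, 100))
    print(ideal_speedup(example))
```

Under these assumptions, the combined throughput is the sum of the individual throughputs, so the gain over the best single component depends on how much the slower components add. Shared memory bandwidth, thermal limits, and scheduling overhead would all reduce the gain on a real device.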
