Cappuccino: Efficient Inference Software Synthesis for Mobile System-on-Chips

Mohammad Motamedi, Daniel Fong, and Soheil Ghiasi

arXiv:1707.02647 · cs.DC · July 11, 2017 · 1 citation

Open Access

TL;DR

Cappuccino is a framework that synthesizes efficient CNN inference software for mobile SoCs. It parallelizes inference for these resource-constrained devices and explores the tradeoffs involved.

Contribution

We introduce a novel synthesis framework for CNN inference software tailored to mobile SoCs, focusing on efficient parallelization and tradeoff analysis.

Findings

01. Significant performance improvements on mobile devices

02. Effective parallelization strategies for CNN inference

03. Tradeoff exploration enhances efficiency

Abstract

Convolutional Neural Networks (CNNs) exhibit remarkable performance in various machine learning tasks. As sensor-equipped Internet of Things (IoT) devices permeate every aspect of modern life, the ability to execute CNN inference, a computationally intensive application, on resource-constrained devices has become increasingly important. In this context, we present Cappuccino, a framework for synthesis of efficient inference software targeting mobile System-on-Chips (SoCs). We propose techniques for efficient parallelization of CNN inference targeting mobile SoCs, and explore the underlying tradeoffs. Experiments with different CNNs on three mobile devices demonstrate the effectiveness of our approach.

Taxonomy

Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Parallel Computing and Optimization Techniques