Cappuccino: Efficient Inference Software Synthesis for Mobile System-on-Chips
Mohammad Motamedi, Daniel Fong, and Soheil Ghiasi

TL;DR
Cappuccino is a framework that synthesizes efficient CNN inference software for mobile SoCs. It parallelizes inference for the target SoC and explores the underlying tradeoffs, so that resource-constrained devices can run computationally intensive CNNs.
Contribution
We introduce a novel synthesis framework for CNN inference software tailored to mobile SoCs, focusing on efficient parallelization and tradeoff analysis.
Findings
Experiments with different CNNs on three mobile devices demonstrate the effectiveness of the approach
Techniques for efficient parallelization of CNN inference on mobile SoCs
Exploration of the tradeoffs underlying these parallelization techniques
Abstract
Convolutional Neural Networks (CNNs) exhibit remarkable performance in various machine learning tasks. As sensor-equipped Internet of Things (IoT) devices permeate every aspect of modern life, the ability to execute CNN inference, a computationally intensive application, on resource-constrained devices has become increasingly important. In this context, we present Cappuccino, a framework for synthesis of efficient inference software targeting mobile System-on-Chips (SoCs). We propose techniques for efficient parallelization of CNN inference targeting mobile SoCs, and explore the underlying tradeoffs. Experiments with different CNNs on three mobile devices demonstrate the effectiveness of our approach.
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Parallel Computing and Optimization Techniques
