DynO: Dynamic Onloading of Deep Neural Networks from Cloud to Device

Mario Almeida; Stefanos Laskaridis; Stylianos I. Venieris; Ilias Leontiadis; Nicholas D. Lane

arXiv:2104.09949·cs.DC·January 12, 2022

TL;DR

DynO is a novel distributed inference framework that dynamically balances cloud and device resources for CNNs, significantly improving performance and reducing data transfer compared to existing methods.

Contribution

It introduces a CNN-specific data packing method and a dynamic scheduler to optimize inference partitioning and data precision in real-time.
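To make the idea of choosing a partition point and a transfer precision concrete, here is a minimal sketch of a latency-driven split-point search. It is not DynO's scheduler: the `Layer` fields, the `best_split` function, and the per-layer `min_bits` values are all hypothetical, and the model ignores the time needed to return the result to the device. It assumes per-layer latencies have already been measured on both sides and that the bandwidth is known.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    device_ms: float  # assumed measured latency of this layer on the device
    cloud_ms: float   # assumed measured latency of this layer on the server
    out_elems: int    # number of activation values the layer outputs
    min_bits: int     # hypothetical lowest bit width that keeps accuracy acceptable


def best_split(layers: List[Layer], input_elems: int, input_bits: int,
               bandwidth_mbps: float) -> Tuple[int, float]:
    """Return (k, latency_ms): layers[:k] run on the device, layers[k:] in the cloud."""
    best_k, best_ms = 0, float("inf")
    for k in range(len(layers) + 1):
        device_ms = sum(layer.device_ms for layer in layers[:k])
        cloud_ms = sum(layer.cloud_ms for layer in layers[k:])
        if k == len(layers):
            transfer_ms = 0.0  # everything ran locally, nothing is sent
        else:
            if k == 0:
                elems, bits = input_elems, input_bits  # send the raw input
            else:
                prev = layers[k - 1]
                elems, bits = prev.out_elems, prev.min_bits  # send packed activations
            # bits / (Mbit/s * 1e6) gives seconds; multiply by 1e3 for milliseconds
            transfer_ms = elems * bits / (bandwidth_mbps * 1000.0)
        total = device_ms + transfer_ms + cloud_ms
        if total < best_ms:
            best_k, best_ms = k, total
    return best_k, best_ms


# Illustrative numbers only, not measurements from the paper.
example_layers = [
    Layer(device_ms=10, cloud_ms=1, out_elems=1000, min_bits=8),
    Layer(device_ms=50, cloud_ms=2, out_elems=100, min_bits=4),
    Layer(device_ms=100, cloud_ms=3, out_elems=10, min_bits=8),
]
fast_link = best_split(example_layers, input_elems=224 * 224 * 3,
                       input_bits=8, bandwidth_mbps=10.0)
slow_link = best_split(example_layers, input_elems=224 * 224 * 3,
                       input_bits=8, bandwidth_mbps=0.001)
```

With these made-up numbers, the fast link favors running only the first layer on the device, while the very slow link pushes the whole network onto the device. Re-running the search whenever the measured bandwidth changes is one simple way to approximate a dynamic scheduler.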

Findings

01. Over 10x throughput improvement over device-only execution

02. Up to 7.9x performance gain over CNN offloading systems

03. Up to 60x reduction in data transferred

Abstract

Recently, there has been an explosive growth of mobile and embedded applications using convolutional neural networks (CNNs). To alleviate their excessive computational demands, developers have traditionally resorted to cloud offloading, inducing high infrastructure costs and a strong dependence on networking conditions. On the other end, the emergence of powerful SoCs is gradually enabling on-device execution. Nonetheless, low- and mid-tier platforms still struggle to run state-of-the-art CNNs sufficiently. In this paper, we present DynO, a distributed inference framework that combines the best of both worlds to address several challenges, such as device heterogeneity, varying bandwidth and multi-objective requirements. Key components that enable this are its novel CNN-specific data packing method, which exploits the variability of precision needs in different parts of the CNN when…
