SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud

Stefanos Laskaridis; Stylianos I. Venieris; Mario Almeida; Ilias Leontiadis; Nicholas D. Lane

arXiv:2008.06402·cs.LG·August 25, 2020

TL;DR

SPINN is a distributed device-cloud inference system for CNNs that adapts dynamically to network conditions, improving throughput, reducing costs, and maintaining accuracy in mobile and mission-critical applications.

Contribution

SPINN introduces a novel scheduler that co-optimizes the early-exit policy and the CNN split point at run time, so that inference stays robust and efficient across diverse conditions.
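To make the idea of jointly choosing a split point and an early-exit policy concrete, here is a minimal sketch. It is not the scheduler from the paper: the `Config` fields, the latency model, the selection rule (lowest expected latency subject to an accuracy floor), and every number in `CANDIDATES` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Config:
    """One candidate design point. All values are hypothetical profiling data."""
    split_layer: int       # layers before this index run on device, the rest in the cloud
    exit_threshold: float  # confidence required to stop at an on-device early exit
    device_ms: float       # on-device compute time up to the split
    cloud_ms: float        # cloud compute time after the split
    transfer_kb: float     # kilobytes sent to the cloud when a sample is offloaded
    offload_rate: float    # fraction of samples that do not exit early on device
    accuracy: float        # expected accuracy of this configuration


def expected_latency_ms(cfg: Config, bandwidth_mbps: float) -> float:
    """Device time, plus transfer and cloud time for the offloaded fraction."""
    if cfg.offload_rate == 0.0:
        return cfg.device_ms  # nothing leaves the device
    if bandwidth_mbps <= 0.0:
        return float("inf")   # offloading is impossible without connectivity
    # kilobytes * 8 = kilobits; 1 Mbit/s equals 1 kilobit per millisecond.
    transfer_ms = cfg.transfer_kb * 8.0 / bandwidth_mbps
    return cfg.device_ms + cfg.offload_rate * (transfer_ms + cfg.cloud_ms)


def pick_config(configs: List[Config], bandwidth_mbps: float,
                min_accuracy: float) -> Optional[Config]:
    """Return the lowest-latency config that meets the accuracy floor, if any."""
    feasible = [c for c in configs if c.accuracy >= min_accuracy]
    if not feasible:
        return None
    return min(feasible, key=lambda c: expected_latency_ms(c, bandwidth_mbps))


# Hypothetical candidates: cloud-only, split with early exit, device-only.
CANDIDATES = [
    Config(0, 1.0, 0.0, 10.0, 150.0, 1.0, 0.76),
    Config(8, 0.8, 30.0, 6.0, 50.0, 0.4, 0.74),
    Config(16, 0.8, 80.0, 0.0, 0.0, 0.0, 0.72),
]

if __name__ == "__main__":
    for bw in (100.0, 1.0):
        best = pick_config(CANDIDATES, bandwidth_mbps=bw, min_accuracy=0.73)
        print(bw, best.split_layer, expected_latency_ms(best, bw))
```

With these made-up numbers, a fast link favours sending everything to the cloud, while a slow link favours the split configuration, whose early exit keeps most samples on the device. Re-running `pick_config` whenever the measured bandwidth changes is one simple way to picture run-time adaptation.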

Findings

01 Up to 2x throughput improvement over state-of-the-art methods

02 Reduces server costs by up to 6.8x

03 Improves accuracy by 20.7% under latency constraints

Abstract

Despite the soaring use of convolutional neural networks (CNNs) in mobile applications, uniformly sustaining high-performance inference on mobile has been elusive due to the excessive computational demands of modern CNNs and the increasing diversity of deployed devices. A popular alternative comprises offloading CNN processing to powerful cloud-based servers. Nevertheless, by relying on the cloud to produce outputs, emerging mission-critical and high-mobility applications, such as drone obstacle avoidance or interactive applications, can suffer from the dynamic connectivity conditions and the uncertain availability of the cloud. In this paper, we propose SPINN, a distributed inference system that employs synergistic device-cloud computation together with a progressive inference method to deliver fast and robust CNN inference across diverse settings. The proposed system introduces a…
