Edge-PRUNE: Flexible Distributed Deep Learning Inference
Jani Boutellier, Bo Tan, Jari Nurmi

TL;DR
Edge-PRUNE is a flexible, lightweight framework for distributed deep learning inference. Built on a formal dataflow model, it aims to improve privacy, reduce latency, and speed up inference on heterogeneous devices.
Contribution
The paper introduces a dataflow-based framework for distributed inference that is agnostic to the machine learning training framework and supports deep learning accelerators such as embedded GPUs, improving performance and privacy.
Findings
Object tracking inference accelerated by 5.8x
Supports heterogeneous devices and accelerators
Demonstrated on image classification and object tracking
Abstract
Collaborative deep learning inference between low-resource endpoint devices and edge servers has received significant research interest in the last few years. Such computation partitioning can help reduce endpoint device energy consumption and improve latency, but equally importantly, it also helps preserve the privacy of sensitive data. This paper describes Edge-PRUNE, a flexible but light-weight computation framework for distributing machine learning inference between edge servers and one or more client devices. Compared to previous approaches, Edge-PRUNE is based on a formal dataflow computing model and is agnostic towards machine learning training frameworks, while also offering wide support for leveraging deep learning accelerators such as embedded GPUs. The experimental section of the paper demonstrates the use and performance of Edge-PRUNE by image classification and…
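The computation partitioning mentioned in the abstract can be illustrated with a minimal sketch. This is not Edge-PRUNE's API or its dataflow model: the names `scale`, `relu`, `run`, and `partition`, the toy layers, and the split index are all hypothetical and chosen only for illustration. The sketch shows a layer chain cut at one point, so that the endpoint computes the first layers and only the intermediate features, rather than the raw input, would be sent to the edge server.

```python
from typing import Callable, List, Sequence, Tuple

# A "layer" here is just a function on a list of floats (hypothetical stand-in
# for a real neural network layer).
Layer = Callable[[List[float]], List[float]]


def scale(factor: float) -> Layer:
    """Return a toy layer that multiplies every value by `factor`."""
    return lambda xs: [factor * x for x in xs]


def relu(xs: List[float]) -> List[float]:
    """Elementwise ReLU."""
    return [max(0.0, x) for x in xs]


def run(layers: Sequence[Layer], xs: List[float]) -> List[float]:
    """Apply the layers in order and return the final output."""
    for layer in layers:
        xs = layer(xs)
    return xs


def partition(
    layers: Sequence[Layer], split: int
) -> Tuple[Sequence[Layer], Sequence[Layer]]:
    """Cut the chain: layers [0, split) for the endpoint, the rest for the server."""
    if not 0 <= split <= len(layers):
        raise ValueError("split must lie within the layer chain")
    return layers[:split], layers[split:]


model = [scale(2.0), relu, scale(0.5), relu]
endpoint_part, server_part = partition(model, split=2)

raw_input = [1.0, -3.0, 4.0]
features = run(endpoint_part, raw_input)  # would run on the endpoint device
result = run(server_part, features)       # would run on the edge server

# The partitioned pipeline gives the same answer as running the whole model.
assert result == run(model, raw_input)
```

In a real deployment the `features` list would be serialized and sent over the network between the two calls to `run`. Where to place the split is a trade-off between endpoint compute, transmitted data size, and how much of the input can be inferred from the intermediate features.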
Taxonomy
Topics: IoT and Edge/Fog Computing · Privacy-Preserving Technologies in Data · Age of Information Optimization
