Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy
En Li, Zhi Zhou, Xu Chen

TL;DR
Edgent is a framework that enables low-latency deep neural network inference on resource-constrained devices by adaptively partitioning computation between device and edge, and by exiting inference early at an intermediate DNN layer.
Contribution
This paper introduces Edgent, a novel framework combining adaptive DNN partitioning and right-sizing for efficient edge intelligence.
Findings
Effective DNN partitioning between device and edge reduces latency.
Early-exit DNNs significantly cut computation time.
Prototype on Raspberry Pi shows improved performance.
Abstract
As the backbone technology of machine learning, deep neural networks (DNNs) have quickly ascended to the spotlight. Running DNNs on resource-constrained mobile devices is, however, by no means trivial, since it incurs high performance and energy overhead. Offloading DNNs to the cloud for execution, meanwhile, suffers from unpredictable performance due to the uncontrolled, long wide-area network latency. To address these challenges, we propose Edgent, a collaborative and on-demand DNN co-inference framework with device-edge synergy. Edgent pursues two design knobs: (1) DNN partitioning, which adaptively partitions DNN computation between device and edge in order to leverage hybrid computation resources in proximity for real-time DNN inference; and (2) DNN right-sizing, which accelerates DNN inference through early exit at a proper intermediate DNN layer to further reduce the…
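The two knobs described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a chain-structured DNN, known per-layer latencies on each side, and that the device runs the first layers and the edge runs the rest; the abstract does not specify the direction. The function name `plan_co_inference` and all parameter names are hypothetical, and the cost of returning the final result to the device is ignored for simplicity.

```python
from typing import Dict, List, Optional, Tuple


def plan_co_inference(
    device_ms: List[float],          # assumed per-layer latency on the device
    edge_ms: List[float],            # assumed per-layer latency on the edge server
    out_kb: List[float],             # assumed output size of each layer
    input_kb: float,                 # size of the raw input
    bandwidth_kb_per_ms: float,      # device-to-edge bandwidth
    exit_accuracy: Dict[int, float], # exit layer (1-based depth) -> accuracy
    deadline_ms: float,
) -> Optional[Tuple[int, int, float]]:
    """Return (exit_layer, split, latency_ms), or None if no plan meets the deadline.

    `split` is the number of leading layers executed on the device;
    layers split..exit_layer-1 (0-based) run on the edge.
    """
    # Right-sizing: try exit points from most to least accurate.
    for exit_layer, _acc in sorted(exit_accuracy.items(), key=lambda kv: -kv[1]):
        candidates = []
        # Partitioning: try every split point for this exit.
        for split in range(exit_layer + 1):
            if split == exit_layer:
                transfer = 0.0  # everything runs on the device
            elif split == 0:
                transfer = input_kb / bandwidth_kb_per_ms  # ship raw input
            else:
                transfer = out_kb[split - 1] / bandwidth_kb_per_ms
            latency = (
                sum(device_ms[:split])
                + transfer
                + sum(edge_ms[split:exit_layer])
            )
            candidates.append((latency, split))
        latency, split = min(candidates)
        if latency <= deadline_ms:
            return exit_layer, split, latency
    return None


if __name__ == "__main__":
    plan = plan_co_inference(
        device_ms=[10.0, 20.0, 30.0],
        edge_ms=[1.0, 2.0, 3.0],
        out_kb=[100.0, 10.0, 1.0],
        input_kb=200.0,
        bandwidth_kb_per_ms=1.0,
        exit_accuracy={2: 0.7, 3: 0.9},
        deadline_ms=50.0,
    )
    print(plan)
```

With a looser deadline the sketch keeps the deepest, most accurate exit and only moves the split point; with a tighter one it falls back to an earlier exit, trading accuracy for latency.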
Taxonomy
Topics: Age of Information Optimization · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
