Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing
En Li, Liekang Zeng, Zhi Zhou, Xu Chen

TL;DR
This paper introduces Edgent, a framework that speeds up real-time DNN inference on mobile devices by adaptively partitioning computation between the device and the edge and by exiting early from the network to cut latency under varying network conditions.
Contribution
The paper proposes an edge computing framework that combines adaptive DNN partitioning with DNN right-sizing (exiting early at an intermediate layer) to address latency in mobile AI inference under both static and dynamic network environments.
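To make the two ideas concrete, the following is a minimal sketch of how a partition point and an exit point could be chosen together. It is not the authors' implementation. The function names (`partition_latency`, `best_partition`, `choose_exit_and_partition`), the per-layer latency and tensor-size inputs, the device-first execution order, and the choice to ignore the cost of returning the final result are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# One candidate network (an early-exit branch or the full model), described by
# hypothetical profiling data: per-layer latency on the device, per-layer
# latency on the edge, and the size of the tensor entering each layer.
Branch = Tuple[Sequence[float], Sequence[float], Sequence[float]]


def partition_latency(k: int, device_ms: Sequence[float], edge_ms: Sequence[float],
                      data_bytes: Sequence[float], bytes_per_ms: float) -> float:
    """Latency when layers [0, k) run on the device and layers [k, n) on the edge.

    data_bytes[k] is the size of the tensor entering layer k, so data_bytes[0]
    is the raw input. Assumption: nothing is transferred when k == n.
    """
    n = len(device_ms)
    compute = sum(device_ms[:k]) + sum(edge_ms[k:])
    transfer = 0.0 if k == n else data_bytes[k] / bytes_per_ms
    return compute + transfer


def best_partition(device_ms: Sequence[float], edge_ms: Sequence[float],
                   data_bytes: Sequence[float], bytes_per_ms: float) -> Tuple[int, float]:
    """Try every split point and return (k, latency_ms) with the lowest latency."""
    n = len(device_ms)
    candidates = [
        (partition_latency(k, device_ms, edge_ms, data_bytes, bytes_per_ms), k)
        for k in range(n + 1)
    ]
    latency, k = min(candidates)
    return k, latency


def choose_exit_and_partition(branches: List[Branch], bytes_per_ms: float,
                              deadline_ms: float) -> Optional[Tuple[int, int, float]]:
    """Pick the deepest branch whose best partition meets the deadline.

    branches is ordered from the shallowest exit to the full network, on the
    assumption that deeper exits are more accurate. Returns
    (branch_index, split_point, latency_ms), or None if no branch fits.
    """
    for idx in range(len(branches) - 1, -1, -1):
        device_ms, edge_ms, data_bytes = branches[idx]
        k, latency = best_partition(device_ms, edge_ms, data_bytes, bytes_per_ms)
        if latency <= deadline_ms:
            return idx, k, latency
    return None


if __name__ == "__main__":
    # Made-up profiling numbers for a three-layer model with two earlier exits.
    branches = [
        ([10], [1], [1000]),
        ([10, 20], [1, 2], [1000, 200]),
        ([10, 20, 30], [1, 2, 3], [1000, 200, 400]),
    ]
    print(choose_exit_and_partition(branches, bytes_per_ms=10, deadline_ms=32))
```

With a looser deadline the sketch keeps the full network and only moves the split point; with a tighter one it falls back to a shallower exit, trading accuracy for latency.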
Findings
Edgent achieves significant latency reduction in DNN inference.
The framework effectively adapts to network fluctuations with high accuracy.
Experimental results validate Edgent's practicality on Raspberry Pi and PC.
Abstract
As a key technology for enabling Artificial Intelligence (AI) applications in the 5G era, Deep Neural Networks (DNNs) have quickly attracted widespread attention. However, it is challenging to run computation-intensive DNN-based tasks on mobile devices due to their limited computation resources. What's worse, traditional cloud-assisted DNN inference is heavily hindered by significant wide-area network latency, leading to poor real-time performance and a low quality of user experience. To address these challenges, in this paper, we propose Edgent, a framework that leverages edge computing for DNN collaborative inference through device-edge synergy. Edgent exploits two design knobs: (1) DNN partitioning that adaptively partitions computation between device and edge for the purpose of coordinating the powerful cloud resource and the proximal edge resource for real-time DNN inference; (2) DNN…
Taxonomy
Topics: IoT and Edge/Fog Computing · Advanced Memory and Neural Computing · Advanced Neural Network Applications
Methods: pc · Early exiting using confidence measures
