Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning
Letian Zhang, Lixing Chen, Jie Xu

TL;DR
This paper introduces Autodidactic Neurosurgeon (ANS), an online learning system that dynamically optimizes DNN partitioning between mobile devices and edge servers, significantly reducing inference delay in resource-constrained environments.
Contribution
It proposes a novel online learning algorithm, $\mu$LinUCB, for adaptive DNN partitioning, enabling real-time system optimization without offline profiling.
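To make the bandit idea concrete, here is a minimal sketch of a standard LinUCB learner adapted to pick the partition point with the lowest estimated delay. It is not the paper's $\mu$LinUCB, whose details are not given on this page. The class name `LinUCBDelay`, the parameters `alpha` and `ridge`, and the assumption that each candidate partition point is described by a feature vector are all hypothetical choices made for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


class LinUCBDelay:
    """Standard LinUCB with a shared linear model, used to minimize delay.

    Assumption (not from the paper): the delay of a partition point is
    roughly linear in a feature vector describing that point.
    """

    def __init__(self, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha
        # Inverse of (ridge * I + sum of x x^T), maintained incrementally.
        self.A_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _matvec(self, v):
        return [dot(row, v) for row in self.A_inv]

    def select(self, contexts):
        """Return the index of the arm with the lowest optimistic delay."""
        theta = self._matvec(self.b)  # ridge-regression estimate
        scores = []
        for x in contexts:
            mean = dot(theta, x)
            width = math.sqrt(max(dot(x, self._matvec(x)), 0.0))
            # Lower confidence bound, because delay is a cost to minimize.
            scores.append(mean - self.alpha * width)
        return min(range(len(contexts)), key=scores.__getitem__)

    def update(self, x, delay):
        """Fold one observed (features, delay) pair into the model."""
        Ax = self._matvec(x)
        denom = 1.0 + dot(x, Ax)
        # Sherman-Morrison rank-one update of the inverse.
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        for i in range(len(x)):
            self.b[i] += delay * x[i]
```

In each round the caller would build one feature vector per candidate partition point, call `select`, run inference with the chosen split, and pass the measured delay to `update`. No offline profiling table is needed because the model is refined from the delays observed during normal operation.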
Findings
ANS outperforms benchmarks in reducing inference delay
The $\mu$LinUCB algorithm has provable performance guarantees
System adapts effectively to changing environments
Abstract
Recent breakthroughs in deep learning (DL) have led to the emergence of many intelligent mobile applications and services, but in the meanwhile also pose unprecedented computing challenges on resource-constrained mobile devices. This paper builds a collaborative deep inference system between a resource-constrained mobile device and a powerful edge server, aiming at joining the power of both on-device processing and computation offloading. The basic idea of this system is to partition a deep neural network (DNN) into a front-end part running on the mobile device and a back-end part running on the edge server, with the key challenge being how to locate the optimal partition point to minimize the end-to-end inference delay. Unlike existing efforts on DNN partitioning that rely heavily on a dedicated offline profiling stage to search for the optimal partition point, our system has a…
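The partitioning objective described in the abstract can be illustrated with a small sketch. It assumes the per-layer delays and intermediate output sizes are already known, which is exactly the offline-profiling setting the paper aims to avoid, so it shows only what is being optimized, not how the system learns it. The function names, the argument names, and the simplification that the delay of returning the result is ignored are all assumptions made for illustration.

```python
def end_to_end_delay(p, device_ms, server_ms, out_bytes, input_bytes,
                     bandwidth_bytes_per_ms):
    """Delay when layers [0, p) run on the device and layers [p, n) on the server.

    p == 0 offloads everything (the raw input is transmitted);
    p == n runs everything on the device (nothing is transmitted).
    """
    n = len(device_ms)
    front = sum(device_ms[:p])   # front-end compute on the mobile device
    back = sum(server_ms[p:])    # back-end compute on the edge server
    if p == n:
        transmit = 0.0
    else:
        payload = input_bytes if p == 0 else out_bytes[p - 1]
        transmit = payload / bandwidth_bytes_per_ms
    return front + transmit + back


def best_partition(device_ms, server_ms, out_bytes, input_bytes,
                   bandwidth_bytes_per_ms):
    """Exhaustively search all n + 1 partition points; return (point, delay)."""
    n = len(device_ms)
    delays = [
        end_to_end_delay(p, device_ms, server_ms, out_bytes, input_bytes,
                         bandwidth_bytes_per_ms)
        for p in range(n + 1)
    ]
    best = min(range(n + 1), key=delays.__getitem__)
    return best, delays[best]
```

With made-up numbers for a three-layer network (device delays 10, 20, 30 ms; server delays 1, 2, 3 ms; layer outputs of 1000, 100, 10 bytes; a 5000-byte input), the best split is after layer 1 at 100 bytes/ms but moves to after layer 2 at 10 bytes/ms. The optimal point therefore shifts with network conditions, which is why a fixed offline choice can become stale.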
Taxonomy
Topics: IoT and Edge/Fog Computing · Age of Information Optimization · Molecular Communication and Nanonetworks
