Optimal Model Placement and Online Model Splitting for Device-Edge Co-Inference
Jia Yan, Suzhi Bi, Ying-Jun Angela Zhang

TL;DR
This paper proposes a joint optimization framework for model placement and online model splitting in device-edge co-inference, aiming to minimize energy and latency costs under wireless channel fading.
Contribution
It introduces a novel joint optimization approach for model placement and splitting decisions, including an optimal stopping formulation and analytical solutions for specific DNN structures.
Findings
Joint optimization reduces energy and latency costs.
Analytical model splitting rules improve decision efficiency.
Simulation confirms performance gains over baseline methods.
Abstract
Device-edge co-inference opens up new possibilities for resource-constrained wireless devices (WDs) to execute deep neural network (DNN)-based applications with heavy computation workloads. In particular, the WD executes the first few layers of the DNN and sends the intermediate features to the edge server, which processes the remaining layers of the DNN. The choice of model splitting point trades off local computation cost against communication overhead. In practice, the DNN model is re-trained and updated periodically at the edge server. Once the DNN parameters are regenerated, part of the updated model must be placed at the WD to facilitate on-device inference. In this paper, we study the joint optimization of the model placement and online model splitting decisions to minimize the energy-and-time cost of device-edge co-inference in the presence of wireless channel…
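The tradeoff between local computation cost and communication overhead can be illustrated with a small sketch. This is not the paper's algorithm: it is a static, single-channel-state toy that assumes hypothetical per-layer energy and time figures, a fixed transmit power, a known uplink rate, and a weighted energy-plus-time cost, and it ignores edge-side computation and model placement costs. All names (`split_costs`, `best_split`, `weight`) and all numbers are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def split_costs(
    local_energy: Sequence[float],   # joules to run each layer on the WD (assumed)
    local_time: Sequence[float],     # seconds to run each layer on the WD (assumed)
    feature_bits: Sequence[float],   # bits sent if splitting after layer k, k = 0..N
    rate_bps: float,                 # uplink rate for the current channel state
    tx_power_w: float,               # transmit power of the WD
    weight: float = 0.5,             # hypothetical energy-vs-time weighting
) -> List[float]:
    """Weighted energy-and-time cost for every split point k = 0..N.

    k = 0 means the raw input is offloaded; k = N means all layers run locally
    and only the final output is sent.
    """
    n = len(local_energy)
    if len(local_time) != n or len(feature_bits) != n + 1:
        raise ValueError("need N layer costs and N + 1 feature sizes")

    costs = []
    local = 0.0  # accumulated local cost of layers 1..k
    for k in range(n + 1):
        if k > 0:
            local += weight * local_energy[k - 1] + (1 - weight) * local_time[k - 1]
        tx_time = feature_bits[k] / rate_bps
        tx = weight * tx_power_w * tx_time + (1 - weight) * tx_time
        costs.append(local + tx)
    return costs


def best_split(*args, **kwargs) -> Tuple[int, float]:
    """Return the split point with the lowest cost, and that cost."""
    costs = split_costs(*args, **kwargs)
    k = min(range(len(costs)), key=costs.__getitem__)
    return k, costs[k]


if __name__ == "__main__":
    # Made-up three-layer model; feature sizes shrink with depth.
    energy = [0.2, 0.3, 0.5]
    time_s = [0.1, 0.1, 0.2]
    bits = [8e6, 2e6, 1e6, 1e3]
    for rate in (1e6, 1e7, 1e8):
        print(rate, best_split(energy, time_s, bits, rate, tx_power_w=1.0))
```

With these made-up numbers the chosen split point moves from fully local execution at the lowest rate toward full offloading at the highest rate, which is the channel dependence that motivates deciding the split online rather than fixing it in advance.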
Taxonomy
Topics: Age of Information Optimization · Machine Learning and ELM · Advanced Memory and Neural Computing
