Optimal Model Placement and Online Model Splitting for Device-Edge Co-Inference

Jia Yan; Suzhi Bi; Ying-Jun Angela Zhang

arXiv:2105.13618·cs.LG·May 31, 2021

Open Access

TL;DR

This paper proposes a joint optimization framework for model placement and online model splitting in device-edge co-inference, aiming to minimize energy and latency costs under wireless channel fading.

Contribution

It introduces a novel joint optimization approach for model placement and splitting decisions, including an optimal stopping formulation and analytical solutions for specific DNN structures.

Findings

01 Joint optimization reduces energy and latency costs.

02 Analytical model splitting rules improve decision efficiency.

03 Simulation confirms performance gains over baseline methods.

Abstract

Device-edge co-inference opens up new possibilities for resource-constrained wireless devices (WDs) to execute deep neural network (DNN)-based applications with heavy computation workloads. In particular, the WD executes the first few layers of the DNN and sends the intermediate features to the edge server that processes the remaining layers of the DNN. By adapting the model splitting decision, there exists a tradeoff between local computation cost and communication overhead. In practice, the DNN model is re-trained and updated periodically at the edge server. Once the DNN parameters are regenerated, part of the updated model must be placed at the WD to facilitate on-device inference. In this paper, we study the joint optimization of the model placement and online model splitting decisions to minimize the energy-and-time cost of device-edge co-inference in presence of wireless channel…
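
To make the tradeoff described in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a chain-structured DNN with known per-layer costs, a single known channel rate, and a weighted sum of energy and time as the objective; it ignores edge-server computation time and model placement cost. The names `Layer`, `split_cost`, `best_split`, and all parameters are hypothetical and introduced only for illustration. The paper's online optimal-stopping formulation under channel fading is not reproduced here; this sketch simply enumerates every split point for one channel realization.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    """Hypothetical per-layer profile measured on the wireless device (WD)."""
    local_energy_j: float  # energy the WD spends running this layer
    local_time_s: float    # time the WD spends running this layer
    output_bits: float     # size of the features this layer emits


def split_cost(layers: List[Layer], input_bits: float, split: int,
               rate_bps: float, tx_power_w: float, weight: float) -> float:
    """Weighted energy-and-time cost when the WD runs layers[:split]
    and uploads the intermediate features to the edge server.

    Assumption: edge computation time is ignored, and when split equals
    len(layers) everything runs locally and nothing is transmitted.
    """
    energy = sum(layer.local_energy_j for layer in layers[:split])
    time = sum(layer.local_time_s for layer in layers[:split])
    if split < len(layers):
        # Upload the raw input (split == 0) or the last local layer's output.
        bits = input_bits if split == 0 else layers[split - 1].output_bits
        tx_time = bits / rate_bps
        energy += tx_power_w * tx_time
        time += tx_time
    return weight * energy + (1.0 - weight) * time


def best_split(layers: List[Layer], input_bits: float, rate_bps: float,
               tx_power_w: float, weight: float) -> Tuple[int, float]:
    """Exhaustively evaluate every split point and return (index, cost)."""
    costs = [split_cost(layers, input_bits, s, rate_bps, tx_power_w, weight)
             for s in range(len(layers) + 1)]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]
```

Under these assumptions, a poor channel pushes the best split deeper into the network, where the features are smaller, while a good channel favors uploading early and sparing the WD the local computation.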

Taxonomy

Topics: Age of Information Optimization · Machine Learning and ELM · Advanced Memory and Neural Computing