Privacy-Aware Joint DNN Model Deployment and Partitioning Optimization for Collaborative Edge Inference Services
Zhipeng Cheng, Xiaoyu Xia, Hong Wang, Minghui Liwang, Ning Chen, Xuwei Fan, and Xianbin Wang

TL;DR
This paper introduces a privacy-aware optimization framework that jointly handles DNN model deployment, user-server association, and model partitioning on edge devices, aiming to minimize long-term average inference delay while satisfying privacy and resource constraints.
Contribution
It proposes a novel joint optimization approach using Lyapunov methods and coalition games to improve edge inference performance under privacy and resource limitations.
Findings
Significantly reduces inference delay in edge scenarios
Effectively balances privacy constraints with system performance
Outperforms existing baselines across various simulations
Abstract
Edge inference (EI) has emerged as a promising paradigm to address the growing limitations of cloud-based Deep Neural Network (DNN) inference services, such as high response latency, limited scalability, and severe data privacy exposure. However, deploying DNN models on resource-constrained edge devices introduces additional challenges, including limited computation/storage resources, dynamic service demands, and heightened privacy risks. To tackle these issues, this paper presents a novel privacy-aware optimization framework that jointly addresses DNN model deployment, user-server association, and model partitioning, with the goal of minimizing long-term average inference delay under resource and privacy constraints. The problem is formulated as a complex, NP-hard stochastic optimization. To efficiently handle system dynamics and computational complexity, we employ a Lyapunov-based…
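The abstract is cut off before it describes the Lyapunov-based step, so the paper's actual algorithm is not reproduced here. As a rough illustration of the general idea only, the sketch below shows the textbook drift-plus-penalty pattern from Lyapunov optimization: a virtual queue tracks how far a long-term average constraint has been violated, and each time slot picks the action minimizing a weighted sum of delay and queue-scaled cost. The function name, the `(delay, cost)` action pairs, the single scalar budget, and the trade-off weight `V` are all assumptions made for this example and are not taken from the paper.

```python
def drift_plus_penalty(actions, budget, V, num_slots):
    """Generic drift-plus-penalty loop (illustrative, not the paper's method).

    actions:   list of hypothetical (delay, cost) pairs available in every slot
    budget:    assumed long-term average limit on cost
    V:         weight on delay; larger V favors low delay over constraint slack
    num_slots: number of time slots to simulate
    """
    queue = 0.0          # virtual queue: accumulated constraint violation
    total_delay = 0.0
    total_cost = 0.0
    for _ in range(num_slots):
        # Per-slot decision: minimize V * delay + queue * cost.
        delay, cost = min(actions, key=lambda a: V * a[0] + queue * a[1])
        total_delay += delay
        total_cost += cost
        # Queue grows when cost exceeds the budget and drains otherwise.
        queue = max(queue + cost - budget, 0.0)
    return total_delay / num_slots, total_cost / num_slots, queue


# Hypothetical toy setting: a fast but costly option and a slow but cheap one.
toy_actions = [(1.0, 3.0), (4.0, 0.5)]
avg_delay, avg_cost, final_queue = drift_plus_penalty(
    toy_actions, budget=1.0, V=10.0, num_slots=2000
)
print(avg_delay, avg_cost, final_queue)
```

In this toy run the loop alternates between the two options so that the time-averaged cost stays close to the budget, which is the behavior the virtual queue is meant to enforce. How the paper maps deployment, association, and partitioning decisions onto such a per-slot problem is not stated in the visible text.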
