Edge-MultiAI: Multi-Tenancy of Latency-Sensitive Deep Learning Applications on Edge
SM Zobaed, Ali Mokhtari, Jaya Prakash Champati, Mathieu Kourouma, Mohsen Amini Salehi

TL;DR
Edge-MultiAI is a framework that enables multiple latency-sensitive deep learning applications to run concurrently on edge servers by efficiently managing neural network models, reducing memory contention, and maximizing multi-tenancy.
Contribution
It introduces a novel model management framework using compression and dynamic loading, along with a Bayesian-based heuristic for request prediction to enhance multi-tenancy on edge servers.
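As an illustration of what Bayesian request prediction can look like, the sketch below keeps a Beta-Bernoulli estimate of how likely each application is to issue a request in the next time window, and ranks applications for pre-loading. It is a generic textbook estimator, not the heuristic from the paper; the class name `RequestPredictor`, the helper `apps_to_preload`, the windowing scheme, and the 0.5 threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestPredictor:
    """Beta-Bernoulli estimate of per-window request probability (hypothetical)."""
    alpha: float = 1.0  # prior pseudo-count of windows that contained a request
    beta: float = 1.0   # prior pseudo-count of windows without a request

    def observe(self, requested: bool) -> None:
        # Bayesian update: add one count to the matching outcome.
        if requested:
            self.alpha += 1
        else:
            self.beta += 1

    def probability(self) -> float:
        # Posterior mean of the Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)


def apps_to_preload(predictors: Dict[str, RequestPredictor],
                    threshold: float = 0.5) -> List[str]:
    """Return apps whose predicted request probability meets the threshold,
    most likely first. The threshold is an assumed tuning knob."""
    likely = [name for name, p in predictors.items()
              if p.probability() >= threshold]
    return sorted(likely, key=lambda name: -predictors[name].probability())


# Usage: "camera" requested in 3 of 4 windows, "voice" in 0 of 4.
predictors = {"camera": RequestPredictor(), "voice": RequestPredictor()}
for hit in (True, True, False, True):
    predictors["camera"].observe(hit)
    predictors["voice"].observe(False)
preload = apps_to_preload(predictors)
```

With the uniform prior used here, "camera" ends at a posterior mean of 4/6 and "voice" at 1/6, so only "camera" is selected for pre-loading.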
Findings
Increases the degree of multi-tenancy on edge servers by at least 2x.
Increases warm-start inferences by around 60%.
Maintains inference accuracy despite resource constraints.
Abstract
Smart IoT-based systems often need continuous execution of multiple latency-sensitive Deep Learning (DL) applications. Edge servers are the cornerstone of such IoT-based systems; however, their resource limitations hamper the continuous execution of multiple (multi-tenant) DL applications. The challenge is that DL applications rely on bulky "neural network (NN) models" that cannot all be kept simultaneously in the limited memory space of the edge. Accordingly, the main contribution of this research is to overcome the memory contention challenge, thereby meeting the latency constraints of the DL applications without compromising their inference accuracy. We propose an efficient NN model management framework, called Edge-MultiAI, that ushers the NN models of the DL applications into the edge memory such that the degree of multi-tenancy and the number of…
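The memory contention problem described above can be made concrete with a small sketch. It assumes each application ships several variants of its NN model (for example, an original and a compressed one) and greedily downgrades variants until everything fits a memory budget, choosing at each step the downgrade that loses the least accuracy per megabyte saved. This is an illustrative greedy policy under those assumptions, not the Edge-MultiAI algorithm; `Variant`, `fit_models`, and all sizes and accuracies are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Variant:
    name: str        # e.g. "fp32" or "int8" (hypothetical labels)
    size_mb: int     # memory footprint when loaded
    accuracy: float  # assumed inference accuracy of this variant


def fit_models(apps: Dict[str, List[Variant]], budget_mb: int) -> Dict[str, Variant]:
    """Pick one variant per app so the total size fits within budget_mb."""
    # Order each app's variants from largest to smallest.
    ladders = {a: sorted(vs, key=lambda v: v.size_mb, reverse=True)
               for a, vs in apps.items()}
    level = {a: 0 for a in ladders}  # start with the largest variant everywhere

    def total() -> int:
        return sum(ladders[a][i].size_mb for a, i in level.items())

    while total() > budget_mb:
        best_app, best_cost = None, None
        for a, i in level.items():
            if i + 1 >= len(ladders[a]):
                continue  # already at the smallest variant
            cur, nxt = ladders[a][i], ladders[a][i + 1]
            saved = cur.size_mb - nxt.size_mb
            if saved <= 0:
                continue  # downgrade would not free memory
            cost = (cur.accuracy - nxt.accuracy) / saved  # accuracy lost per MB
            if best_cost is None or cost < best_cost:
                best_app, best_cost = a, cost
        if best_app is None:
            # Nothing left to compress: evict the largest resident model.
            victim = max(level, key=lambda a: ladders[a][level[a]].size_mb)
            del level[victim]
        else:
            level[best_app] += 1
    return {a: ladders[a][i] for a, i in level.items()}


# Usage with made-up numbers: 700 MB of full models, 420 MB of edge memory.
apps = {
    "vision": [Variant("fp32", 400, 0.95), Variant("int8", 100, 0.93)],
    "speech": [Variant("fp32", 300, 0.90), Variant("int8", 80, 0.80)],
}
loaded = fit_models(apps, budget_mb=420)
```

In this example the vision model is downgraded to its compressed variant because that frees the most memory for the least accuracy loss, while the speech model stays uncompressed; both applications remain resident in memory and can be served without a cold start.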
Taxonomy
Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Explainable Artificial Intelligence (XAI)
