Online Learning for Orchestration of Inference in Multi-User End-Edge-Cloud Networks

Sina Shahhosseini; Dongjoo Seo; Anil Kanduri; Tianyi Hu; Sung-soo Lim; Bryan Donyanavard; Amir M. Rahmani; Nikil Dutt

arXiv:2202.10541·cs.LG·February 23, 2022

Open Access

TL;DR

This paper introduces a reinforcement learning approach for dynamic computation offloading and model selection in end-edge-cloud networks, optimizing latency and accuracy for deep learning inference in resource-constrained environments.

Contribution

It presents an online learning framework that jointly optimizes offloading policies and model choices, improving response times with minimal accuracy loss.

Findings

01. 35% speedup in response time over state-of-the-art methods

02. Less than 0.9% reduction in model accuracy

03. Effective in real-setup multi-cloud and edge configurations

Abstract

Deep-learning-based intelligent services have become prevalent in cyber-physical applications including smart cities and health-care. Deploying deep-learning-based intelligence near the end-user enhances privacy protection, responsiveness, and reliability. Resource-constrained end-devices must be carefully managed in order to meet the latency and energy requirements of computationally-intensive deep learning services. Collaborative end-edge-cloud computing for deep learning provides a range of performance and efficiency that can address application requirements through computation offloading. The decision to offload computation is a communication-computation co-optimization problem that varies with both system parameters (e.g., network condition) and workload characteristics (e.g., inputs). On the other hand, deep learning model optimization provides another source of tradeoff between…
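To make the joint decision concrete, here is a minimal sketch of an online learner that picks both where to run inference (end, edge, or cloud) and which model variant to use. It is not the paper's algorithm: it is a plain tabular Q-learning agent with epsilon-greedy exploration, and every name and number in it (`TIERS`, `MODELS`, `toy_environment`, the latencies, accuracies, and the accuracy floor) is an assumption invented for illustration. The discount factor defaults to 0 because, in this toy setup, the next network state does not depend on the chosen action.

```python
import random
from itertools import product

# Hypothetical action space: execution tier x model variant.
TIERS = ("end", "edge", "cloud")
MODELS = ("small", "large")
ACTIONS = list(product(TIERS, MODELS))


def reward(latency_ms, accuracy, min_accuracy=0.85):
    """Illustrative reward: negative latency, with a large penalty
    when accuracy drops below an assumed floor."""
    penalty = 1000.0 if accuracy < min_accuracy else 0.0
    return -(latency_ms + penalty)


def toy_environment(state, action):
    """Made-up latency (ms) and accuracy for a (network state, action) pair."""
    tier, model = action
    compute = {"end": 300.0, "edge": 120.0, "cloud": 40.0}[tier]
    if model == "small":
        compute *= 0.4  # assume the compressed model is cheaper to run
    network = {"good": 20.0, "poor": 400.0}[state]
    hops = {"end": 0.0, "edge": 1.0, "cloud": 1.5}[tier]
    accuracy = 0.90 if model == "large" else 0.87
    return compute + network * hops, accuracy


class QAgent:
    """Tabular Q-learning with epsilon-greedy exploration."""

    def __init__(self, alpha=0.1, gamma=0.0, epsilon=0.1, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}  # state -> {action: estimated value}
        self.rng = random.Random(seed)

    def _values(self, state):
        return self.q.setdefault(state, {a: 0.0 for a in ACTIONS})

    def greedy(self, state):
        values = self._values(state)
        return max(values, key=values.get)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)  # explore
        return self.greedy(state)  # exploit

    def update(self, state, action, r, next_state):
        values = self._values(state)
        best_next = max(self._values(next_state).values())
        target = r + self.gamma * best_next
        values[action] += self.alpha * (target - values[action])


def train(steps=2000, seed=0):
    """Run the agent online against the toy environment."""
    agent = QAgent(seed=seed)
    env_rng = random.Random(seed + 1)
    state = "good"
    for _ in range(steps):
        action = agent.act(state)
        latency, accuracy = toy_environment(state, action)
        next_state = env_rng.choice(("good", "poor"))
        agent.update(state, action, reward(latency, accuracy), next_state)
        state = next_state
    return agent


if __name__ == "__main__":
    trained = train()
    for s in ("good", "poor"):
        print(s, "->", trained.greedy(s))
```

Under these invented numbers the learned greedy policy offloads to the cloud with the small model when the network is good and keeps the small model on the end device when the network is poor, which illustrates why the offloading choice and the model choice interact with network conditions.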

Taxonomy

Topics: IoT and Edge/Fog Computing · Privacy-Preserving Technologies in Data · Age of Information Optimization