Online Learning for Orchestration of Inference in Multi-User End-Edge-Cloud Networks
Sina Shahhosseini, Dongjoo Seo, Anil Kanduri, Tianyi Hu, Sung-soo Lim, Bryan Donyanavard, Amir M. Rahmani, Nikil Dutt

TL;DR
This paper introduces a reinforcement learning approach for dynamic computation offloading and model selection in end-edge-cloud networks, optimizing latency and accuracy for deep learning inference in resource-constrained environments.
Contribution
It presents an online learning framework that jointly optimizes offloading policies and model choices, improving response times with minimal accuracy loss.
Findings
35% speedup in response time over state-of-the-art methods
Less than 0.9% reduction in model accuracy
Effective in real-setup multi-cloud and edge configurations
Abstract
Deep-learning-based intelligent services have become prevalent in cyber-physical applications including smart cities and healthcare. Deploying deep-learning-based intelligence near the end-user enhances privacy protection, responsiveness, and reliability. Resource-constrained end-devices must be carefully managed in order to meet the latency and energy requirements of computationally intensive deep learning services. Collaborative end-edge-cloud computing for deep learning provides a range of performance and efficiency that can address application requirements through computation offloading. The decision to offload computation is a communication-computation co-optimization problem that varies with both system parameters (e.g., network condition) and workload characteristics (e.g., inputs). On the other hand, deep learning model optimization provides another source of tradeoff between…
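The joint choice described above (where to run inference and which model variant to use, given the current network condition) can be illustrated with a toy online learner. The sketch below is not the authors' algorithm: it is a minimal epsilon-greedy contextual bandit with sample-average updates, and every name and number in it (TIERS, MODELS, COMPUTE_MS, NETWORK_MS, ACCURACY, the accuracy weight, the noise level) is a hypothetical value chosen only for illustration.

```python
import random
from itertools import product

# Hypothetical action space: execution tier and model variant.
TIERS = ("end", "edge", "cloud")
MODELS = ("small", "large")
ACTIONS = list(product(TIERS, MODELS))

# Illustrative numbers only; none of these come from the paper.
COMPUTE_MS = {
    ("end", "small"): 80, ("end", "large"): 400,
    ("edge", "small"): 30, ("edge", "large"): 120,
    ("cloud", "small"): 10, ("cloud", "large"): 40,
}
NETWORK_MS = {
    "good": {"end": 0, "edge": 20, "cloud": 60},
    "poor": {"end": 0, "edge": 150, "cloud": 500},
}
ACCURACY = {"small": 0.70, "large": 0.75}


def reward(state, action, accuracy_weight=2000.0):
    """Trade latency (lower is better) against accuracy (higher is better)."""
    tier, model = action
    latency_ms = COMPUTE_MS[action] + NETWORK_MS[state][tier]
    return -latency_ms + accuracy_weight * ACCURACY[model]


def train(steps=5000, epsilon=0.1, seed=0):
    """Learn a value estimate per (network state, action) pair online."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in NETWORK_MS for a in ACTIONS}
    visits = {key: 0 for key in q}
    states = list(NETWORK_MS)
    for _ in range(steps):
        state = rng.choice(states)  # observed network condition
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        # Noisy observation of the reward, standing in for measurement jitter.
        r = reward(state, action) + rng.gauss(0.0, 10.0)
        visits[(state, action)] += 1
        # Incremental sample-average update of the value estimate.
        q[(state, action)] += (r - q[(state, action)]) / visits[(state, action)]
    return q


def greedy_policy(q):
    """Pick the highest-valued action for each network state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in NETWORK_MS}


if __name__ == "__main__":
    for state, action in greedy_policy(train()).items():
        print(state, "->", action)
```

With these made-up numbers, the noise-free reward is highest for offloading the large model to the cloud when the network is good, and for running the small model on the end device when the network is poor, so the learned policy shifts with network condition. A real orchestrator would face a much larger state space (multiple users, device load, input characteristics), which is where a reinforcement learning agent rather than a lookup table becomes relevant.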
Taxonomy
Topics: IoT and Edge/Fog Computing · Privacy-Preserving Technologies in Data · Age of Information Optimization
