Online Resource Allocation for Edge Intelligence with Colocated Model Retraining and Inference

Huaiguang Cai; Zhi Zhou; Qianyi Huang

arXiv:2405.16029 · cs.LG · May 28, 2024

TL;DR

This paper presents ORRIC, an online algorithm for resource allocation that balances model retraining and inference on edge servers, improving accuracy amid data drift and resource constraints.

Contribution

Introduces ORRIC, a lightweight, explainable online approximation algorithm for resource allocation in colocated model retraining and inference at the edge.

Findings

01

ORRIC outperforms traditional inference-only methods in accuracy.

02

The algorithm adapts well to persistent data drift scenarios.

03

Experimental validation confirms effectiveness in real edge environments.

Abstract

With edge intelligence, AI models are increasingly pushed to the edge to serve ubiquitous users. However, due to drift in the model, data, and task, AI models deployed at the edge suffer from degraded accuracy in the inference serving phase. Model retraining handles such drift by periodically retraining the model with newly arrived data. When model retraining and inference serving for the same model are colocated on resource-limited edge servers, a fundamental challenge arises: balancing the resources allocated to retraining and to inference so as to maximize long-term inference accuracy. This problem is particularly difficult because the underlying mathematical formulation is time-coupled, non-convex, and NP-hard. To address these challenges, we introduce a lightweight and explainable online approximation algorithm, named ORRIC, designed to optimize resource allocation…

Code & Models

Repositories

caihuaiguang/ORRIC
PyTorch · Official

Taxonomy

Topics: Neural Networks and Applications · Distributed and Parallel Computing Systems · Stochastic Gradient Optimization Techniques