Online Resource Allocation for Edge Intelligence with Colocated Model Retraining and Inference
Huaiguang Cai, Zhi Zhou, Qianyi Huang

TL;DR
This paper presents ORRIC, an online algorithm for resource allocation that balances model retraining and inference on edge servers, improving accuracy amid data drift and resource constraints.
Contribution
Introduces ORRIC, a lightweight, explainable online approximation algorithm for resource allocation in colocated model retraining and inference at the edge.
Findings
ORRIC outperforms traditional inference-only methods in accuracy.
The algorithm adapts well to persistent data drift scenarios.
Experimental validation confirms effectiveness in real edge environments.
Abstract
With edge intelligence, AI models are increasingly pushed to the edge to serve ubiquitous users. However, because of drift in models, data, and tasks, AI models deployed at the edge suffer from degraded accuracy in the inference serving phase. Model retraining handles such drift by periodically retraining the model with newly arrived data. When model retraining and model inference serving for the same model are colocated on resource-limited edge servers, a fundamental challenge arises: balancing the resource allocation between retraining and inference so as to maximize long-term inference accuracy. This problem is particularly difficult because the underlying mathematical formulation is time-coupled, non-convex, and NP-hard. To address these challenges, we introduce a lightweight and explainable online approximation algorithm, named ORRIC, designed to optimize resource allocation…
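To make the time coupling in the abstract concrete, here is a toy sketch of the retraining/inference trade-off. It is not ORRIC and is not taken from the paper: the function `simulate`, the linear drift and retraining gain, the square-root accuracy curve, and all constants are hypothetical assumptions chosen only for illustration.

```python
def simulate(retrain_share, slots=20, drift=0.03, gain=0.10, quality=0.9):
    """Average per-slot accuracy for a fixed retraining share in [0, 1].

    Toy model (all assumptions, not from the paper):
      - one unit of compute per time slot, split between retraining and inference
      - served accuracy = model quality * sqrt(inference share)
      - quality drops by `drift` each slot and rises by `gain * retrain_share`
    """
    total = 0.0
    for _ in range(slots):
        # Inference gets whatever retraining does not use in this slot.
        inference_share = 1.0 - retrain_share
        total += quality * inference_share ** 0.5
        # Time coupling: today's retraining share shapes tomorrow's quality.
        quality = min(1.0, max(0.0, quality - drift + gain * retrain_share))
    return total / slots


inference_only = simulate(0.0)  # never retrain, quality decays with drift
balanced = simulate(0.3)        # give 30% of each slot to retraining
print(f"inference-only: {inference_only:.3f}, balanced: {balanced:.3f}")
```

Under these assumed constants, the fixed 30% split ends up with a higher average accuracy than the inference-only policy, because the resources given up in each slot offset the drift that would otherwise accumulate. An online algorithm such as the one the paper proposes would choose the split per slot rather than fixing it in advance.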
Taxonomy
Topics: Neural Networks and Applications · Distributed and Parallel Computing Systems · Stochastic Gradient Optimization Techniques
