OCTOPINF: Workload-Aware Inference Serving for Edge Video Analytics
Thanh-Tung Nguyen, Lucas Liebe, Nhat-Quang Tau, Yuheng Wu, Jinghan Cheng, Dongman Lee

TL;DR
OCTOPINF is a workload-aware inference system for edge video analytics that optimizes resource allocation and scheduling to meet real-time requirements, significantly improving throughput and robustness in dynamic edge environments.
Contribution
We introduce OCTOPINF, a novel inference serving system that employs fine-grained resource management and a spatiotemporal scheduling algorithm for edge video analytics.
Findings
Achieves up to 10x higher throughput than baselines.
Demonstrates improved robustness in dynamic edge scenarios.
Effectively balances workloads between edge devices and servers.
Abstract
Edge Video Analytics (EVA) has gained significant attention as a major application of pervasive computing, enabling real-time visual processing. EVA pipelines, composed of deep neural networks (DNNs), typically demand efficient inference serving under stringent latency requirements, which is challenging in dynamic edge environments (e.g., under workload variability and network instability). Moreover, EVA pipelines face significant resource contention caused by resource (e.g., GPU) constraints at the edge. In this paper, we introduce OCTOPINF, a novel resource-efficient and workload-aware inference serving system designed for real-time EVA. OCTOPINF tackles the unique challenges of dynamic edge environments through fine-grained resource allocation, adaptive batching, and workload balancing between edge devices and servers. Furthermore, we propose a spatiotemporal scheduling…
Taxonomy
Topics: Brain Tumor Detection and Classification · Medical Imaging Techniques and Applications · IoT and Edge/Fog Computing
