DeepRT: A Soft Real Time Scheduler for Computer Vision Applications on the Edge
Zhe Yang, Klara Nahrstedt, Hongpeng Guo, Qian Zhou

TL;DR
DeepRT is a GPU scheduler designed for edge computing that ensures soft real-time performance for computer vision applications by batching requests and managing overruns, improving latency guarantees and throughput.
Contribution
We introduce DeepRT, a novel GPU scheduler with DisBatcher and Adaptation modules that provide latency guarantees and high throughput for multi-tenant edge AI workloads.
Findings
DeepRT reduces deadline misses compared to existing schedulers.
DeepRT maintains high throughput while providing latency guarantees.
DeepRT outperforms state-of-the-art schedulers in both latency and throughput metrics.
Abstract
The ubiquity of smartphone cameras and IoT cameras, together with the recent boom of deep learning and deep neural networks, has led to a proliferation of computer vision driven mobile and IoT applications deployed on the edge. This paper focuses on applications which make soft real time requests to perform inference on their data: they desire prompt responses within designated deadlines, but occasional deadline misses are acceptable. Supporting soft real time applications on a multi-tenant edge server is not easy, since the requests sharing the limited GPU computing resources of an edge server interfere with each other. In order to tackle this problem, we comprehensively evaluate how latency and throughput respond to different GPU execution plans. Based on this analysis, we propose a GPU scheduler, DeepRT, which provides latency guarantees to the requests while maintaining high overall system…
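To make the trade-off between batching and deadline misses concrete, here is a minimal sketch of a single-GPU simulation. It is an illustration only and not DeepRT's DisBatcher or Adaptation module: the `Request` type, the linear `batch_latency` cost model with its constants, and the earliest-deadline-first ordering are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    arrival: float   # arrival time in seconds
    deadline: float  # absolute deadline in seconds


def batch_latency(batch_size, base=0.010, per_item=0.002):
    """Hypothetical linear cost model: fixed launch cost plus per-item cost."""
    return base + per_item * batch_size


def simulate(requests, max_batch):
    """Run batches back to back on one GPU and return the deadline miss rate.

    Pending requests are ordered earliest-deadline-first, and up to
    max_batch of them are executed together as one batch.
    """
    n = len(requests)
    if n == 0:
        return 0.0
    pending = sorted(requests, key=lambda r: r.arrival)
    queue = []
    now = 0.0
    missed = 0
    i = 0
    while i < n or queue:
        # Admit every request that has arrived by the current time.
        while i < n and pending[i].arrival <= now:
            queue.append(pending[i])
            i += 1
        if not queue:
            # GPU is idle: jump ahead to the next arrival.
            now = pending[i].arrival
            continue
        queue.sort(key=lambda r: r.deadline)  # earliest deadline first
        batch, queue = queue[:max_batch], queue[max_batch:]
        now += batch_latency(len(batch))
        missed += sum(1 for r in batch if now > r.deadline)
    return missed / n


# Eight requests arrive together, each with a 30 ms deadline.
reqs = [Request(arrival=0.0, deadline=0.030) for _ in range(8)]
print(simulate(reqs, max_batch=1))  # one request per batch
print(simulate(reqs, max_batch=8))  # all requests in one batch
```

Under this assumed cost model, running the eight requests one at a time lets only the first two finish before the deadline, while a single batch of eight finishes in 26 ms and meets every deadline. Real GPU latency curves are not linear, which is why the paper evaluates execution plans empirically.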
Taxonomy
Topics: Age of Information Optimization · IoT and Edge/Fog Computing · Advanced Neural Network Applications
