Scheduling Real-time Deep Learning Services as Imprecise Computations
Shuochao Yao, Yifan Hao, Yiran Zhao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Jinyang Li, Tarek Abdelzaher

TL;DR
This paper introduces a real-time scheduling algorithm for deep learning tasks on edge devices, optimizing accuracy by selectively executing optional neural network parts within deadline constraints.
Contribution
It proposes a novel scheduling approach that treats neural network workflows as imprecise computations with mandatory and optional parts, improving accuracy under real-time constraints.
Findings
Increases accuracy by 10-20%
Nearly eliminates deadline misses
Effective on GPU hardware for vision tasks
Abstract
The paper presents an efficient real-time scheduling algorithm for intelligent real-time edge services, defined as those that perform machine intelligence tasks, such as voice recognition, LIDAR processing, or machine vision, on behalf of local embedded devices that are themselves unable to support extensive computations. The work contributes to a recent direction in real-time computing that develops scheduling algorithms for machine intelligence tasks with anytime prediction. We show that deep neural network workflows can be cast as imprecise computations, each with a mandatory part and (several) optional parts whose execution utility depends on input data. The goal of the real-time scheduler is to maximize the average accuracy of deep neural network outputs while meeting task deadlines, thanks to opportunistic shedding of the least necessary optional parts. The work is motivated by…
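To make the mandatory/optional split concrete, here is a minimal sketch of imprecise-computation planning. It is not the paper's algorithm: the `Task` class, the `plan` function, the shared time budget, and the greedy gain-per-millisecond rule are all assumptions chosen for illustration, and the per-stage accuracy gains are assumed to be estimated in advance.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task model: one mandatory part plus ordered optional stages."""
    name: str
    mandatory_ms: float
    # Optional stages in execution order: (cost_ms, estimated_accuracy_gain).
    # Costs are assumed to be strictly positive.
    optional: list = field(default_factory=list)


def plan(tasks, budget_ms):
    """Return {task name: number of optional stages to run} within budget_ms.

    Greedy illustration only: every mandatory part is always scheduled, then
    the next optional stage with the best gain per millisecond is added until
    nothing else fits. Stages that are never picked are the ones shed.
    """
    remaining = budget_ms - sum(t.mandatory_ms for t in tasks)
    if remaining < 0:
        raise ValueError("mandatory parts alone exceed the budget")

    depth = {t.name: 0 for t in tasks}
    while True:
        best = None  # (gain per ms, task, cost) of the best stage that fits
        for t in tasks:
            k = depth[t.name]
            if k < len(t.optional):
                cost, gain = t.optional[k]
                if cost <= remaining:
                    ratio = gain / cost
                    if best is None or ratio > best[0]:
                        best = (ratio, t, cost)
        if best is None:
            return depth
        _, chosen, cost = best
        depth[chosen.name] += 1
        remaining -= cost


if __name__ == "__main__":
    tasks = [
        Task("a", 10.0, [(5.0, 0.10), (5.0, 0.02)]),
        Task("b", 10.0, [(5.0, 0.05), (5.0, 0.04)]),
    ]
    print(plan(tasks, budget_ms=35.0))  # {'a': 1, 'b': 2}
```

With a 35 ms budget, the 20 ms of mandatory work leaves room for three optional stages, and the second stage of task `a` is shed because it offers the smallest estimated gain. A real scheduler would also have to handle per-task deadlines and gains that depend on the input data, which this sketch ignores.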
Taxonomy
Topics: Real-Time Systems Scheduling · Advanced Neural Network Applications · Age of Information Optimization
