Differentiate Quality of Experience Scheduling for Deep Learning Inferences with Docker Containers in the Cloud
Ying Mao, Weifeng Yan, Yun Song, Yue Zeng, Ming Chen, Long Cheng, and Qingzhi Liu

TL;DR
This paper introduces DQoES, a scheduler that dynamically allocates cloud resources for deep learning inference tasks based on specified QoE targets, significantly improving satisfaction levels.
Contribution
The paper presents a novel QoE-aware scheduling system for cloud-based deep learning inferences, enabling tailored resource management according to client requirements.
Findings
DQoES achieves up to 8x more satisfied models compared to existing systems.
It effectively manages multiple concurrent jobs with different QoE targets.
Experimental results validate the scheduler's ability to meet diverse QoE specifications.
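To make the idea of QoE-targeted allocation concrete, here is a minimal Python sketch of one rebalancing round. It is not the paper's algorithm: the `Job` fields, the `rebalance` function, and the `tolerance`, `step`, and `min_share` parameters are all hypothetical names chosen for illustration. The sketch assumes each job reports a measured response time and holds a fraction of a shared CPU budget.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    target_s: float    # client-specified QoE target (response time in seconds)
    measured_s: float  # most recently observed response time in seconds
    share: float       # fraction of the CPU budget currently assigned


def rebalance(jobs: List[Job], tolerance: float = 0.1,
              step: float = 0.05, min_share: float = 0.05) -> List[Job]:
    """Run one illustrative round: shift CPU share from jobs that beat
    their target to jobs that miss it. All thresholds are assumptions."""
    # Jobs slower than their target by more than the tolerance band.
    slow = [j for j in jobs if j.measured_s > j.target_s * (1 + tolerance)]
    # Jobs faster than their target by more than the tolerance band.
    fast = [j for j in jobs if j.measured_s < j.target_s * (1 - tolerance)]
    if not slow or not fast:
        return jobs  # nothing to take from, or nobody needs more

    freed = 0.0
    for j in fast:
        # Never push a job below the minimum share.
        give = max(0.0, min(step, j.share - min_share))
        j.share -= give
        freed += give

    # Split the reclaimed share evenly among the jobs missing their target.
    for j in slow:
        j.share += freed / len(slow)
    return jobs


jobs = [
    Job("vehicle-detector", target_s=1.0, measured_s=2.0, share=0.5),
    Job("phone-unlock", target_s=5.0, measured_s=1.0, share=0.5),
]
rebalance(jobs)
```

In this example the job missing its 1-second target gains share at the expense of the job that comfortably beats its 5-second target, and the total share stays constant. A real deployment would then have to apply the new shares to the running containers, for instance through Docker's resource-limit update mechanism; how DQoES does this is not described in the summary above.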
Abstract
With the prevalence of big-data-driven applications, such as face recognition on smartphones and tailored recommendations from Google Ads, we are on the road to a lifestyle with significantly more intelligence than ever before. Various neural-network-powered models run at the back end of these applications to enable quick responses to users. Supporting those models requires substantial cloud-based computational resources, e.g., CPUs and GPUs. Cloud providers charge their clients by the amount of resources they occupy, so clients have to balance their budget against quality of experience (e.g., response time). The budget depends on the individual business owner, and the required Quality of Experience (QoE) depends on the usage scenarios of different applications. For instance, an autonomous vehicle requires a real-time response, but unlocking your smartphone can tolerate delays. However,…
