Differentiate Quality of Experience Scheduling for Deep Learning Inferences with Docker Containers in the Cloud

Ying Mao; Weifeng Yan; Yun Song; Yue Zeng; Ming Chen; Long Cheng; Qingzhi Liu

arXiv:2010.12728·cs.DC·September 26, 2022

TL;DR

This paper introduces DQoES, a scheduler that dynamically allocates cloud resources to deep learning inference tasks based on client-specified QoE targets, achieving up to 8x more satisfied models than existing systems.

Contribution

The paper presents a novel QoE-aware scheduling system for cloud-based deep learning inferences, enabling tailored resource management according to client requirements.
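
As a rough illustration of what QoE-aware resource management can look like, the sketch below performs one feedback step that shifts CPU share toward containers running slower than their response-time target and away from those running faster than needed. It is a minimal sketch under stated assumptions, not the DQoES algorithm: the function name `rebalance`, the proportional update rule, and the `step` and `floor` parameters are all hypothetical and do not come from the paper.

```python
from typing import Dict


def rebalance(shares: Dict[str, float],
              measured: Dict[str, float],
              targets: Dict[str, float],
              step: float = 0.5,
              floor: float = 0.05) -> Dict[str, float]:
    """One hypothetical control step (assumed rule, not from the paper).

    shares   -- current fraction of CPU given to each container
    measured -- observed response time per container (seconds)
    targets  -- client-specified QoE target per container (seconds)
    """
    adjusted = {}
    for name, share in shares.items():
        # error > 0: slower than the target, so the container needs more CPU.
        # error < 0: faster than required, so it can give some CPU back.
        error = (measured[name] - targets[name]) / targets[name]
        adjusted[name] = max(floor, share * (1.0 + step * error))
    # Renormalize so the shares still sum to 1 across all containers.
    total = sum(adjusted.values())
    return {name: value / total for name, value in adjusted.items()}


new_shares = rebalance(
    shares={"detector": 0.5, "recommender": 0.5},
    measured={"detector": 2.0, "recommender": 0.5},
    targets={"detector": 1.0, "recommender": 1.0},
)
```

Here the hypothetical `detector` container misses its target and gains share, while `recommender` beats its target and gives some up. On a Docker host, shares like these could plausibly be applied with `docker update --cpu-shares`, though that wiring is an assumption here rather than something this summary states.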

Findings

1. DQoES achieves up to 8x more satisfied models compared to existing systems.

2. It effectively manages multiple concurrent jobs with different QoE targets.

3. Experimental results validate the scheduler's ability to meet diverse QoE specifications.

Abstract

With the prevalence of big-data-driven applications, such as face recognition on smartphones and tailored recommendations from Google Ads, we are on the road to a lifestyle with significantly more intelligence than ever before. Various neural-network-powered models run at the back end of these applications to enable quick responses to users. Supporting those models requires substantial cloud-based computational resources, e.g., CPUs and GPUs. Cloud providers charge their clients by the amount of resources they occupy, so clients have to balance budget against quality of experience (e.g., response time). The budget depends on the individual business owner, and the required Quality of Experience (QoE) depends on the usage scenarios of different applications. For instance, an autonomous vehicle requires a real-time response, but unlocking your smartphone can tolerate delays. However,…
