A Theory of Auto-Scaling for Resource Reservation in Cloud Services

Konstantinos Psychas; Javad Ghaderi

arXiv:2005.13744·cs.DC·May 29, 2020

Open Access

TL;DR

This paper develops a scalable resource reservation policy for large cloud server systems. The policy handles job admission and scheduling without prior knowledge of traffic and asymptotically achieves a guaranteed fraction of the optimal expected reward.

Contribution

It introduces a resource reservation policy that adapts to demand and, in large-scale cloud environments, asymptotically achieves a constant fraction (at least 1/2) of the optimal expected reward.

Findings

01. The policy asymptotically achieves at least 1/2 of the optimal expected reward.

02. Under certain conditions, the policy achieves at least $1 - 1/e$ of the optimal expected reward.

03. The policy automatically scales the number of VM slots and efficiently manages high-priority jobs.

Abstract

We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, disk, etc.). Jobs with different rewards arrive over time and require certain amounts of resources for the duration of their service. When a job arrives, the system must decide whether to admit it or reject it, and if admitted, in which server to schedule the job. The objective is to maximize the expected total reward received by the system. This problem is motivated by control of cloud computing clusters, in which jobs are requests for Virtual Machines or Containers that reserve resources for various services, and rewards represent the service priority of requests or the price paid per time unit of service by clients. We study this problem in an asymptotic regime where the number of servers and jobs' arrival rates scale by a factor $L$, as $L$ becomes…
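To make the admit-or-reject decision described in the abstract concrete, the following is a minimal sketch of the problem setting only. It uses a plain first-fit baseline, which is not the paper's reservation policy, and it ignores job departures. The names `Server`, `Job`, `admit_first_fit`, and `total_reward` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Server:
    # Per-resource capacity, e.g. (CPU, memory); units are arbitrary here.
    capacity: Tuple[float, ...]
    used: List[float]

    def fits(self, demand: Sequence[float]) -> bool:
        # A job fits only if every resource stays within capacity.
        return all(u + d <= c for u, d, c in zip(self.used, demand, self.capacity))

    def place(self, demand: Sequence[float]) -> None:
        for i, d in enumerate(demand):
            self.used[i] += d


@dataclass
class Job:
    demand: Tuple[float, ...]  # resources held for the duration of service
    reward: float              # priority or price; a hypothetical scalar


def admit_first_fit(servers: List[Server], job: Job) -> Optional[int]:
    """Baseline (an assumption, not the paper's policy): place the job on the
    first server with room and return its index, or return None to reject."""
    for idx, server in enumerate(servers):
        if server.fits(job.demand):
            server.place(job.demand)
            return idx
    return None


def total_reward(servers: List[Server], jobs: Sequence[Job]) -> float:
    """Sum the rewards of the jobs that first-fit admits, in arrival order."""
    return sum(job.reward for job in jobs if admit_first_fit(servers, job) is not None)


def make_servers(n: int) -> List[Server]:
    # Two resources per server (CPU, memory), all initially free.
    return [Server(capacity=(4.0, 8.0), used=[0.0, 0.0]) for _ in range(n)]
```

A baseline like this admits whatever fits on arrival, so a low-reward job can occupy capacity that a later high-reward job needs. That is the kind of loss a reservation policy is meant to limit.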

Taxonomy

Topics: Advanced Queuing Theory Analysis · Cloud Computing and Resource Management · Scheduling and Optimization Algorithms