A Theory of Auto-Scaling for Resource Reservation in Cloud Services
Konstantinos Psychas, Javad Ghaderi

TL;DR
This paper develops a scalable resource reservation policy for large cloud server systems that manages job admission and scheduling without prior knowledge of traffic rates, with provable guarantees on the expected reward it collects.
Contribution
It introduces a resource reservation policy that adapts to demand and, in large-scale cloud environments, asymptotically achieves a guaranteed constant fraction of the optimal expected reward.
Findings
The policy asymptotically achieves at least 1/2 of the optimal expected reward.
Under certain conditions, the policy achieves at least 1 - 1/e of the optimal expected reward.
Automatically scales VM slots and efficiently manages high-priority jobs.
Abstract
We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, disk, etc.). Jobs with different rewards arrive over time and require certain amounts of resources for the duration of their service. When a job arrives, the system must decide whether to admit it or reject it, and if admitted, in which server to schedule the job. The objective is to maximize the expected total reward received by the system. This problem is motivated by control of cloud computing clusters, in which jobs are requests for Virtual Machines or Containers that reserve resources for various services, and rewards represent service priority of requests or price paid per time unit of service by clients. We study this problem in an asymptotic regime where the number of servers and jobs' arrival rates scale by a factor L, as L becomes…
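The admit-or-reject decision described in the abstract can be illustrated with a small sketch. This is not the paper's reservation policy: it is a naive first-fit baseline, and the names `Server`, `admit_first_fit`, the two-resource capacities, and the job list are all hypothetical. It also simplifies the model by ignoring job departures and counting each reward once on admission.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    # Capacity per resource, e.g. (CPU, memory); values are illustrative.
    capacity: tuple
    used: list = field(default_factory=list)

    def __post_init__(self):
        if not self.used:
            self.used = [0.0] * len(self.capacity)

    def fits(self, demand):
        # A job fits only if every resource stays within capacity.
        return all(u + d <= c for u, d, c in zip(self.used, demand, self.capacity))

    def place(self, demand):
        self.used = [u + d for u, d in zip(self.used, demand)]


def admit_first_fit(servers, demand):
    """Place the job on the first server with room; return its index, or None to reject."""
    for i, server in enumerate(servers):
        if server.fits(demand):
            server.place(demand)
            return i
    return None


# Hypothetical workload: (resource demand, reward) in arrival order.
servers = [Server((4.0, 8.0)), Server((4.0, 8.0))]
jobs = [((2.0, 4.0), 1.0), ((3.0, 2.0), 5.0), ((2.0, 4.0), 1.0), ((3.0, 6.0), 5.0)]

placements = []
total_reward = 0.0
for demand, reward in jobs:
    idx = admit_first_fit(servers, demand)
    placements.append(idx)
    if idx is not None:
        total_reward += reward

print(placements, total_reward)
```

In this toy run the last, high-reward job is rejected because earlier low-reward jobs already occupy the capacity it needs. That is the kind of loss that reserving resources in advance is meant to limit.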
Taxonomy
Topics: Advanced Queuing Theory Analysis · Cloud Computing and Resource Management · Scheduling and Optimization Algorithms
