Modeling The Temporally Constrained Preemptions of Transient Cloud VMs

JCS Kadupitiya; Vikram Jadhao; Prateek Sharma

arXiv:1911.05160·cs.DC·June 18, 2020

TL;DR

This paper investigates the time-dependent preemption patterns of transient cloud VMs, develops a new probabilistic model, and creates resource management policies that significantly improve reliability and reduce costs in scientific computing workloads.

Contribution

It introduces a novel bathtub-shaped preemption probability model for temporally constrained preemptions and demonstrates its effectiveness in optimizing job scheduling and checkpointing policies.

Findings

01. Preemptions are time-dependent: the preemption probability is bathtub-shaped.
02. Existing memoryless models do not adequately describe temporally constrained preemptions.
03. Model-based policies can halve job failure probability and reduce costs by 5x.
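
The contrast between a memoryless model and a bathtub-shaped one can be illustrated with a toy calculation. The sketch below is not the paper's fitted model: the CDF form in `toy_bathtub_cdf` and the parameters `w`, `tau1`, `tau2`, and `rate` are assumptions chosen only to produce a bathtub shape over the 24-hour lifetime mentioned in the abstract. It compares the chance of preemption in the next hour, given that a VM has already survived to a certain age.

```python
import math

LIFETIME_H = 24.0  # maximum Preemptible VM lifetime, per the abstract


def exponential_cdf(t, rate=0.1):
    """Memoryless baseline: P(T <= t) for an exponential lifetime (rate is assumed)."""
    return 1.0 - math.exp(-rate * t)


def toy_bathtub_cdf(t, w=0.4, tau1=1.0, tau2=0.8, b=LIFETIME_H):
    """Hypothetical bathtub-shaped CDF on [0, b]; not the paper's fitted model.

    Roughly a fraction w of preemptions happens shortly after launch, and the
    rest piles up just before the deadline b.
    """
    def g(x):
        early = w * (1.0 - math.exp(-x / tau1))
        late = (1.0 - w) * math.exp((x - b) / tau2)
        return early + late

    t = min(max(t, 0.0), b)
    return (g(t) - g(0.0)) / (g(b) - g(0.0))  # rescale so F(0) = 0 and F(b) = 1


def preempt_within(cdf, age, window=1.0):
    """P(preempted in the next `window` hours | VM has survived to `age`)."""
    survived = 1.0 - cdf(age)
    return (cdf(age + window) - cdf(age)) / survived


for age in (0.0, 10.0, 22.0):
    exp_p = preempt_within(exponential_cdf, age)
    tub_p = preempt_within(toy_bathtub_cdf, age)
    print(f"age {age:4.1f} h  exponential={exp_p:.4f}  bathtub={tub_p:.6f}")
```

Under these assumed parameters the exponential column is the same at every age, while the toy bathtub column is high for a new VM, near zero in mid-life, and high again close to the 24-hour limit. That age dependence is what a memoryless model cannot express.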

Abstract

Transient cloud servers such as Amazon Spot instances, Google Preemptible VMs, and Azure Low-priority batch VMs can reduce cloud computing costs by as much as $10\times$, but can be unilaterally preempted by the cloud provider. Understanding preemption characteristics (such as frequency) is a key first step in minimizing the effect of preemptions on application performance, availability, and cost. However, little is understood about temporally constrained preemptions, wherein preemptions must occur in a given time window. We study temporally constrained preemptions by conducting a large-scale empirical study of Google's Preemptible VMs (which have a maximum lifetime of 24 hours), develop a new preemption probability model, new model-driven resource management policies, and implement them in a batch computing service for scientific computing workloads. Our statistical and experimental…

Code & Models

Repositories

kadupitiya/goog-preemption-data