Cloud-scale VM Deflation for Running Interactive Applications On Transient Servers
Alexander Fuerst, Ahmed Ali-Eldin, Prashant Shenoy, Prateek Sharma

TL;DR
This paper proposes VM deflation as a resource reclamation method for transient cloud servers, letting interactive applications run with minimal performance impact while increasing cloud revenue.
Contribution
It introduces VM deflation as an alternative to preemption, demonstrating its feasibility and effectiveness for supporting interactive applications in transient cloud environments.
Findings
VM deflation can reduce VM size by up to 50% with negligible overhead.
Cluster deflation policies support overcommitment levels up to 50%.
Implementing deflation can increase cloud revenue by 30%.
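To make the overcommitment figure concrete, here is a minimal sketch of one possible cluster policy, proportional deflation, in which every deflatable VM shrinks by the same fraction until total demand fits the server. It is an illustrative assumption and not necessarily the policy the paper evaluates; the function name `proportional_deflate`, the `max_deflation` cap, and the vCPU counts are all hypothetical.

```python
def proportional_deflate(allocations, capacity, max_deflation=0.5):
    """Shrink every VM by the same fraction so total allocation fits capacity.

    allocations:   nominal size of each deflatable VM (e.g. vCPUs).
    capacity:      physical resource available on the server.
    max_deflation: hypothetical per-VM cap; 0.5 mirrors the "up to 50%" figure.
    Returns the deflated sizes, or raises if the cap would be exceeded.
    """
    total = sum(allocations)
    if total <= capacity:
        return list(allocations)  # no resource pressure, nothing to reclaim

    scale = capacity / total      # fraction of its size each VM keeps
    fraction = 1.0 - scale        # fraction of its size each VM gives up
    if fraction > max_deflation:
        raise ValueError("deflation alone cannot relieve this much pressure")
    return [size * scale for size in allocations]


# 12 vCPUs of demand on an 8-vCPU server, i.e. 50% overcommitment.
deflated = proportional_deflate([4, 4, 4], capacity=8)
```

In this sketch, 50% overcommitment requires each VM to give up one third of its nominal allocation, which stays within the assumed 50% per-VM cap. Heavier pressure would raise an error here; a real system would need a fallback such as preemption.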
Abstract
Transient computing has become popular in public cloud environments for running delay-insensitive batch and data processing applications at low cost. Since transient cloud servers can be revoked at any time by the cloud provider, they are considered unsuitable for running interactive applications such as web services. In this paper, we present VM deflation as an alternative mechanism to server preemption for reclaiming resources from transient cloud servers under resource pressure. Using real traces from top-tier cloud providers, we show the feasibility of using VM deflation as a resource reclamation mechanism for interactive applications in public clouds. We show how current hypervisor mechanisms can be used to implement VM deflation and present cluster deflation policies for resource management of transient and on-demand cloud VMs. Experimental evaluation of our deflation system on a…
