LAVA: Lifetime-Aware VM Allocation with Learned Distributions and Adaptation to Mispredictions
Jianheng Ling, Pratik Worah, Yawen Wang, Yunchuan Kong, Anshul Kapoor, Chunlei Wang, Clifford Stein, Diwakar Gupta, Jason Behmer, Logan A. Bush, Prakash Ramanan, Rajesh Kumar, Thomas Chestna, Yajing Liu, Ying Liu, Ye Zhao, Kathryn S. McKinley, Meeyoung Park, Martin Maas

TL;DR
LAVA enhances VM scheduling efficiency in cloud data centers by using lifetime distributions and repredictions, reducing resource wastage, energy consumption, and VM migrations, with proven benefits in Google's production environment.
Contribution
Introduces LAVA, a novel lifetime-aware VM allocation method combining distribution-based predictions and adaptive scheduling, improving efficiency and reliability over prior one-shot prediction approaches.
Findings
Reduces resource stranding by ~3% and memory stranding by ~2%.
Increases number of empty hosts by 2.3-9.2 percentage points.
Decreases VM migrations for defragmentation and maintenance.
Abstract
Scheduling virtual machines (VMs) on hosts in cloud data centers dictates efficiency and is an NP-hard problem with incomplete information. Prior work improved VM scheduling with predicted VM lifetimes. Our work further improves lifetime-aware scheduling using repredictions with lifetime distributions versus one-shot prediction. Our approach repredicts and adjusts VM and host lifetimes when incorrect predictions emerge. We also present novel approaches for defragmentation and regular system maintenance, which are essential to our data center reliability and optimizations, and are not explored in prior work. We show repredictions deliver a fundamental advance in effectiveness over one-shot prediction. We call our novel combination of distribution-based lifetime predictions and scheduling algorithms Lifetime Aware VM Allocation (LAVA). LAVA reduces resource stranding and increases the…
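The difference between a one-shot prediction and a reprediction can be illustrated with a small sketch. It is not the paper's model: the function name `expected_remaining_lifetime`, the sample lifetimes, and the use of an empirical sample set in place of a learned distribution are all assumptions made for illustration. The idea shown is only that conditioning a lifetime distribution on how long a VM has already run can change the estimate substantially.

```python
from bisect import bisect_right


def expected_remaining_lifetime(samples, age):
    """Mean remaining lifetime given the VM has already run for `age` hours.

    `samples` is a hypothetical list of observed lifetimes (hours) standing in
    for a learned lifetime distribution; it is an assumption of this sketch.
    """
    ordered = sorted(samples)
    # Keep only lifetimes strictly longer than the VM's current age.
    survivors = ordered[bisect_right(ordered, age):]
    if not survivors:
        # The VM has outlived every sample, so the samples carry no information.
        return 0.0
    return sum(survivors) / len(survivors) - age


# Hypothetical data: mostly short-lived VMs with a few long-lived ones.
lifetimes = [1, 1, 2, 2, 3, 100, 200]

one_shot = expected_remaining_lifetime(lifetimes, 0)      # estimate at creation
repredicted = expected_remaining_lifetime(lifetimes, 5)   # estimate after 5 hours
```

In this toy data, a VM that has already survived 5 hours can only belong to the long-lived tail, so the repredicted remaining lifetime is much larger than the estimate made at creation. A scheduler that only keeps the initial estimate would not see that shift.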
Taxonomy
Topics: Data Stream Mining Techniques · Time Series Analysis and Forecasting · Distributed and Parallel Computing Systems
