Multiverse: Dynamic VM Provisioning for Virtualized High Performance Computing Clusters
Jashwant Raj Gunasekaran, Michael Cui, Prashanth Thinakaran, Josh Simons, Mahmut Taylan Kandemir, Chita R. Das

TL;DR
This paper introduces Multiverse, a dynamic VM provisioning framework for virtualized HPC clusters that integrates scheduling and resource management to improve utilization and throughput.
Contribution
It presents a novel framework that dynamically spawns VMs by integrating the HPC scheduler with the VM manager, using instant cloning to reduce provisioning overheads.
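The idea of a scheduler that asks the VM manager for a VM only when a job is ready to run can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the paper's implementation: `Job`, `ToyVMManager`, `clone_vm`, `destroy_vm`, and `schedule` are hypothetical names, capacity is modeled as a single pool of CPUs, and cloning is treated as instantaneous.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cpus: int


@dataclass
class ToyVMManager:
    """Hypothetical stand-in for a VM manager; not the paper's API."""
    free_cpus: int
    vms: dict = field(default_factory=dict)

    def clone_vm(self, job: Job) -> bool:
        # Spawn a VM sized for the job only if the host has spare capacity.
        if job.cpus > self.free_cpus:
            return False
        self.free_cpus -= job.cpus
        self.vms[job.name] = job.cpus
        return True

    def destroy_vm(self, job_name: str) -> None:
        # Return the VM's CPUs to the pool when its job finishes.
        self.free_cpus += self.vms.pop(job_name)


def schedule(queue: deque, manager: ToyVMManager) -> list:
    """Start queued jobs in FIFO order, provisioning one VM per job on demand."""
    started = []
    while queue and manager.clone_vm(queue[0]):
        started.append(queue.popleft().name)
    return started


manager = ToyVMManager(free_cpus=8)
queue = deque([Job("a", 4), Job("b", 4), Job("c", 2)])
first = schedule(queue, manager)   # "a" and "b" fit; "c" waits in the queue
manager.destroy_vm("a")            # job "a" completes, freeing 4 CPUs
second = schedule(queue, manager)  # "c" can now be provisioned
```

Because the scheduler consults the manager's free capacity before each launch, VMs exist only while their jobs run, which is the kind of coordination the summary credits for better utilization. A real system would also need to account for cloning latency, memory, and placement across hosts.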
Findings
Instant cloning is 2.5x faster than full cloning.
Resource utilization improves by up to 40%.
Cluster throughput increases by up to 1.5x.
Abstract
Traditionally, HPC workloads have been deployed in bare-metal clusters, but advances in virtualization have paved the way for these workloads to be deployed in virtualized clusters. However, HPC cluster administrators/providers still face challenges in resource elasticity and virtual machine (VM) provisioning at large scale, due to the lack of coordination between a traditional HPC scheduler and the VM hypervisor (resource management layer). This lack of interaction leads to low cluster utilization and low job completion throughput. Furthermore, VM provisioning delays directly impact the overall performance of jobs in the cluster. Hence, there is a need to provision virtualized HPC clusters effectively, making the best use of the physical hardware with minimal provisioning overheads. Towards this, we propose Multiverse, a VM provisioning framework, which can…
