An Online Fragmentation-Aware GPU Scheduler for Multi-Tenant MIG-based Clouds
Marco Zambianco, Lorenzo Fasol, Roberto Doriguzzi-Corin

TL;DR
This paper presents an online, fragmentation-aware GPU scheduler for MIG-based cloud environments that raises workload acceptance rates by placing workloads so as to limit GPU resource fragmentation.
Contribution
The paper introduces a novel fragmentation metric and a greedy scheduling algorithm tailored for MIG-based GPUs to reduce fragmentation and increase workload acceptance.
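The summary does not define the paper's fragmentation metric or its greedy rule, so the following is only a minimal sketch of the general idea under stated assumptions. It models each GPU as 7 slices, uses a hypothetical placement table (`ALLOWED_STARTS`, loosely inspired by MIG's restricted start positions, not taken from the paper or from NVIDIA documentation), and scores fragmentation with an illustrative stand-in: the number of profile sizes that would fit by free-slice count but have no valid placement. The names `fits`, `fragmentation`, and `schedule` are likewise hypothetical.

```python
from typing import Dict, List, Optional, Tuple

NUM_SLICES = 7  # assumed number of slices per GPU in this toy model

# Hypothetical placement table: profile size -> allowed start slices.
# This is an illustrative assumption, not the paper's or NVIDIA's table.
ALLOWED_STARTS: Dict[int, Tuple[int, ...]] = {
    1: (0, 1, 2, 3, 4, 5, 6),
    2: (0, 2, 4),
    3: (0, 4),
    4: (0,),
    7: (0,),
}


def fits(gpu: List[bool], size: int, start: int) -> bool:
    """True if slices [start, start + size) are all free (False = free)."""
    return not any(gpu[start:start + size])


def fragmentation(gpu: List[bool]) -> int:
    """Stand-in metric (an assumption): count profile sizes that are small
    enough for the free-slice total but cannot be placed at any allowed start."""
    free = gpu.count(False)
    return sum(
        1
        for size, starts in ALLOWED_STARTS.items()
        if size <= free and not any(fits(gpu, size, s) for s in starts)
    )


def schedule(gpus: List[List[bool]], size: int) -> Optional[Tuple[int, int]]:
    """Greedy online placement: try every feasible (GPU, start) pair and keep
    the one with the smallest increase in the stand-in fragmentation score.
    Returns (gpu_index, start) or None if the workload must be rejected."""
    best: Optional[Tuple[int, int, int]] = None
    for g, gpu in enumerate(gpus):
        for start in ALLOWED_STARTS[size]:
            if not fits(gpu, size, start):
                continue
            trial = gpu.copy()
            trial[start:start + size] = [True] * size
            delta = fragmentation(trial) - fragmentation(gpu)
            if best is None or delta < best[0]:
                best = (delta, g, start)
    if best is None:
        return None
    _, g, start = best
    gpus[g][start:start + size] = [True] * size  # commit the placement
    return g, start


if __name__ == "__main__":
    cluster = [[False] * NUM_SLICES]
    print(schedule(cluster, 1))  # 1-slice workload
    print(schedule(cluster, 4))  # 4-slice workload
    print(schedule(cluster, 3))  # rejected if no valid start remains
```

In this toy model, a first-fit policy would put the 1-slice workload at slice 0 and thereby block the only allowed start of the 4-slice profile. The greedy rule instead picks a start that leaves that placement open, which is the kind of effect a fragmentation-aware scheduler aims for.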
Findings
Achieves 10% higher workload acceptance under heavy load
Maintains GPU utilization similar to that of baseline methods
Effectively reduces GPU fragmentation through the proposed scheduling approach
Abstract
The explosive growth of AI applications has created unprecedented demand for GPU resources. Cloud providers meet this demand through GPU-as-a-Service platforms that offer rentable GPU resources for running AI workloads. In this context, sharing GPU resources among tenants is essential to maximize the number of scheduled workloads. Among the various GPU sharing technologies, NVIDIA's Multi-Instance GPU (MIG) stands out by partitioning GPUs at the hardware level into isolated slices with dedicated compute and memory, ensuring strong tenant isolation, preventing resource contention, and enhancing security. Despite these advantages, MIG's fixed partitioning introduces scheduling rigidity, leading to severe GPU fragmentation in multi-tenant environments, where workloads are continuously deployed and terminated. Fragmentation leaves GPUs underutilized, limiting the number of…
Taxonomy
Topics: Cloud Computing and Resource Management · Big Data and Digital Economy · Parallel Computing and Optimization Techniques
