Hierarchical Resource Partitioning on Modern GPUs: A Reinforcement Learning Approach
Urvij Saroliya, Eishi Arima, Dai Liu, Martin Schulz

TL;DR
This paper presents a reinforcement learning-based method for hierarchical resource partitioning on modern GPUs. By co-scheduling multiple jobs with features such as MPS and MIG, it improves throughput by up to 1.87x over time-sharing.
Contribution
It introduces a novel reinforcement learning approach to optimize hierarchical GPU resource partitioning and job co-scheduling, enhancing utilization and performance.
Findings
Maximum throughput improved by 1.87x over time-sharing.
Effective joint optimization of partitioning and scheduling.
Demonstrated on two NVIDIA GPU partitioning features: MPS (Multi-Process Service) and MIG (Multi-Instance GPU).
Abstract
GPU-based heterogeneous architectures are now commonly used in HPC clusters. Because their architecture is simple and specialized for data-level parallelism, GPUs can offer much higher computational throughput and memory bandwidth than CPUs of the same generation. However, as the resources available in GPUs have increased exponentially over the past decades, it has become increasingly difficult for a single program to fully utilize them. As a consequence, the industry has started supporting several resource partitioning features that improve resource utilization by co-scheduling multiple programs on the same GPU die at the same time. Driven by this technological trend, this paper focuses on hierarchical resource partitioning on modern GPUs, and as an example, we utilize a combination of two different features available on recent NVIDIA GPUs in a hierarchical manner: MPS…
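To make the idea of a two-level partition concrete, here is a minimal Python sketch of how such a configuration might be represented. It is not the authors' implementation. The names `Job`, `Instance`, `validate`, and `effective_share` are hypothetical, the seven-slice total is an assumption modeled on A100-class MIG, and the per-job percentage is a simplified stand-in for an MPS-style compute cap. Real MIG only permits specific instance profiles, which this sketch does not check.

```python
from dataclasses import dataclass, field

# Assumption: seven compute slices per GPU, modeled on A100-class MIG.
TOTAL_SLICES = 7


@dataclass
class Job:
    name: str
    thread_pct: int  # hypothetical MPS-style compute cap within its instance (1..100)


@dataclass
class Instance:
    slices: int  # compute slices assigned to this MIG-style instance
    jobs: list[Job] = field(default_factory=list)


def validate(partition: list[Instance]) -> None:
    """Reject partitions that use more slices than exist or have bad caps."""
    if sum(inst.slices for inst in partition) > TOTAL_SLICES:
        raise ValueError("partition uses more slices than the GPU provides")
    for inst in partition:
        if inst.slices < 1:
            raise ValueError("each instance needs at least one slice")
        for job in inst.jobs:
            if not 1 <= job.thread_pct <= 100:
                raise ValueError(f"invalid thread_pct for job {job.name}")


def effective_share(partition: list[Instance]) -> dict[str, float]:
    """Upper bound on the fraction of whole-GPU compute each job may use:
    (instance slices / total slices) * (job cap / 100)."""
    validate(partition)
    return {
        job.name: inst.slices / TOTAL_SLICES * job.thread_pct / 100
        for inst in partition
        for job in inst.jobs
    }


# Outer level: two instances (4 + 3 slices). Inner level: jobs a and b
# share the first instance, job c has the second one to itself.
partition = [
    Instance(4, [Job("a", 50), Job("b", 50)]),
    Instance(3, [Job("c", 100)]),
]
shares = effective_share(partition)
```

In this toy configuration, jobs `a` and `b` are each capped at 2/7 of the GPU's compute and job `c` at 3/7. A learned policy such as the one the paper describes would be choosing among configurations of this kind together with the job-to-partition assignment.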
Taxonomy
Methods: Sparse Evolutionary Training
