Pinpoint resource allocation for GPU batch applications
Tim Voigtländer, Manuel Giffels, Günter Quast, Matthias Schnepf, Roger Wolf

TL;DR
This paper investigates resource allocation strategies for GPU batch applications in high energy physics, focusing on optimizing throughput and energy efficiency for low-intensity GPU workloads using NVIDIA's MPS and batch system integration.
Contribution
It introduces a flexible resource allocation approach combining NVIDIA's MPS with batch systems, improving efficiency for diverse GPU workloads in HEP.
Findings
NVIDIA's MPS enhances GPU resource utilization.
The approach improves throughput for low-intensity workloads.
Energy efficiency is increased with optimized resource sharing.
Abstract
With the increasing usage of Machine Learning (ML) in high energy physics (HEP), there is a variety of new analyses with a large spread in compute resource requirements, especially when it comes to GPU resources. For institutes like the Karlsruhe Institute of Technology (KIT) that provide GPU compute resources to HEP via their batch systems or the Grid, high-throughput and energy-efficient usage of their systems is essential. For low-intensity GPU analyses specifically, standard scheduling creates inefficiencies, as resources are over-assigned to such workflows. An approach that is flexible enough to cover the entire spectrum, from multi-process per GPU to multi-GPU per process, is necessary. As a follow-up to the techniques presented at ACAT 2022, this time we study NVIDIA's Multi-Process Service (MPS), its ability to securely distribute device memory and its…
