Taming GPU Underutilization via Static Partitioning and Fine-grained CPU Offloading

Gabin Schieffer; Ruimin Shi; Jie Ren; Ivy Peng

arXiv:2604.08451 · cs.DC · April 10, 2026

TL;DR

This paper characterizes the limitations of GPU sharing and proposes a memory-offloading scheme based on NVLink-C2C to reduce GPU underutilization across diverse workloads.

Contribution

It provides a system-level characterization of GPU sharing options and introduces a novel memory-offloading scheme to address resource mismatch issues.

Findings

01

GPU sharing via MIG reduces underutilization but still faces interference issues.

02

Coarse-grained provisioning often mismatches application needs.

03

Memory offloading via NVLink-C2C improves resource utilization.
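
To make the mismatch between coarse-grained slices and application needs concrete, the following is a minimal sketch rather than the paper's scheme. The slice names and sizes, the example demand, and the helpers `smallest_fitting_slice` and `offload_plan` are all hypothetical and chosen only for illustration. It compares picking the smallest slice that satisfies both compute and memory against picking the slice by compute alone and spilling the memory overflow to CPU memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    name: str
    compute_fraction: float  # share of the GPU's compute units
    memory_gb: float         # GPU memory attached to the slice


# Hypothetical fixed-size partitions; real MIG profiles depend on the GPU model.
SLICES = [
    Slice("small", 1 / 7, 10.0),
    Slice("medium", 3 / 7, 40.0),
    Slice("full", 1.0, 80.0),
]


def smallest_fitting_slice(compute_need, memory_need_gb, slices=SLICES):
    """Smallest slice that satisfies BOTH compute and memory demand."""
    for s in sorted(slices, key=lambda s: (s.memory_gb, s.compute_fraction)):
        if s.compute_fraction >= compute_need and s.memory_gb >= memory_need_gb:
            return s
    return None


def offload_plan(compute_need, memory_need_gb, slices=SLICES):
    """Pick the slice by compute only; overflow memory goes to the CPU side.

    Returns (slice name, GB offloaded), or None if no slice has enough compute.
    """
    for s in sorted(slices, key=lambda s: s.compute_fraction):
        if s.compute_fraction >= compute_need:
            return s.name, max(0.0, memory_need_gb - s.memory_gb)
    return None


# Hypothetical application: light on compute, 14 GB working set.
static_choice = smallest_fitting_slice(0.1, 14.0)
stranded_memory_gb = static_choice.memory_gb - 14.0
offload_choice = offload_plan(0.1, 14.0)

print(static_choice.name, stranded_memory_gb)  # medium 26.0
print(offload_choice)                          # ('small', 4.0)
```

In this toy case, static provisioning forces the "medium" slice and strands 26 GB of GPU memory plus most of that slice's compute, while the offloading variant fits in the "small" slice with 4 GB placed in CPU memory. Whether that trade pays off in practice depends on the bandwidth and latency of the CPU-GPU link, which the sketch does not model.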

Abstract

Advances in GPU compute throughput and memory capacity bring significant opportunities to a wide range of workloads. However, efficiently utilizing these resources remains challenging, particularly because diverse application characteristics may result in imbalanced utilization. Multi-Instance GPU (MIG) is a promising approach to improve utilization by partitioning GPU compute and memory resources into fixed-size slices with isolation. Yet, its effectiveness and limitations in supporting HPC workloads remain an open question. We present a comprehensive system-level characterization of different GPU sharing options using real-world scientific, AI, and data analytics applications, including NekRS, LAMMPS, Llama3, and Qiskit. Our analysis reveals that while GPU sharing via MIG can significantly reduce resource underutilization, and enable system-level improvements in throughput and…
