gpu_ext: Extensible OS Policies for GPUs via eBPF
Yusheng Zheng, Tong Yu, Yiwei Yang, Minghui Jiang, Xiangyu Gao, Jianchang Su, Yanpeng Hu, Wenan Mao, Wei Zhang, Dan Williams, Andi Quinn

TL;DR
gpu_ext introduces an eBPF-based framework that enables programmable, safe, and extensible OS policies for GPUs, improving throughput and reducing tail latency across diverse workloads without requiring application modifications.
Contribution
The paper presents a novel eBPF-based runtime for GPUs that enables safe, flexible policy enforcement directly within GPU drivers and kernels, addressing the limitations of existing approaches.
Findings
Up to 4.8x throughput improvement
Up to 2x tail latency reduction
Low overhead with no application modifications
Abstract
Performance in modern GPU-centric systems increasingly depends on resource management policies, including memory placement, scheduling, and observability. However, uniform policies typically yield suboptimal performance across diverse workloads. Existing approaches present a tradeoff: user-space runtimes provide programmability and flexibility but lack cross-tenant visibility and fine-grained control of hardware resources, while modifications to the OS kernel introduce significant complexity and safety risks. To address this, we argue that the GPU driver and device layer should provide an extensible OS interface for policy enforcement. While the emerging eBPF technology shows potential, directly applying existing host-side eBPF is insufficient because it lacks visibility into and control over critical device-side events, and directly embedding policy code into GPU kernels could…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Security and Verification in Computing · Advanced Data Storage Technologies
