GCAPS: GPU Context-Aware Preemptive Priority-based Scheduling for Real-Time Tasks
Yidi Wang, Cong Liu, Daniel Wong, Hyoseung Kim

TL;DR
This paper introduces GCAPS, a GPU context-aware preemptive scheduling method for real-time tasks that improves schedulability and response times by controlling GPU context scheduling at the driver level, with minimal code modifications.
Contribution
The paper presents a GPU preemption approach that operates at the device driver level. It enables priority-based preemption with simple code changes and comes with a comprehensive response time analysis.
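For readers unfamiliar with response time analysis, the sketch below shows the classic fixed-priority recurrence for fully preemptive CPU tasks. It is the textbook starting point only, not the GPU-aware analysis derived in the paper. The function name, the `(C, T)` task tuples, and the assumption that each deadline equals its period are illustrative choices, not taken from the paper.

```python
def classic_response_time(tasks, i, max_iters=10_000):
    """Textbook fixed-priority response time for task i (not the paper's analysis).

    tasks: list of (C, T) integer pairs, sorted from highest to lowest priority,
           where C is the worst-case execution time and T is the period.
    Assumption: the deadline of each task equals its period.
    Returns the worst-case response time, or None if the deadline is exceeded.
    """
    c_i, t_i = tasks[i]
    r = c_i
    for _ in range(max_iters):
        # Interference from every higher-priority task released within [0, r).
        interference = sum(-(-r // t_j) * c_j for c_j, t_j in tasks[:i])
        r_next = c_i + interference
        if r_next > t_i:
            return None  # would miss the (assumed implicit) deadline
        if r_next == r:
            return r  # fixed point reached
        r = r_next
    return None


example = [(1, 4), (2, 6), (3, 12)]  # hypothetical taskset, highest priority first
bounds = [classic_response_time(example, k) for k in range(len(example))]
```

A GPU-aware analysis such as the one the paper describes would need additional terms, for example for blocking and preemption overhead on the GPU side, which this sketch deliberately leaves out.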
Findings
Up to 40% higher taskset schedulability.
Significant response time improvements over the default NVIDIA scheduling.
Predictable worst-case behavior on embedded platforms.
Abstract
Scheduling real-time tasks that utilize GPUs with analyzable guarantees poses a significant challenge due to the intricate interaction between CPU and GPU resources, as well as the complex GPU hardware and software stack. While much research has been conducted in the real-time research community, several limitations persist, including the absence or limited availability of GPU-level preemption, extended blocking times, and/or the need for extensive modifications to program code. In this paper, we propose GCAPS, a GPU Context-Aware Preemptive Scheduling approach for real-time GPU tasks. Our approach exerts control over GPU context scheduling at the device driver level and enables preemption of GPU execution based on task priorities by simply adding one-line macros to GPU segment boundaries. In addition, we provide a comprehensive response time analysis of GPU-using tasks for both our…
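The segment-boundary idea in the abstract can be pictured with a toy model: each task announces when a GPU segment begins and ends, and a scheduler lets only the highest-priority active context hold the GPU. This is a minimal sketch under stated assumptions, not the GCAPS implementation. The class, the method names `segment_begin` and `segment_end` (standing in for the one-line macros), and the "larger number means higher priority" convention are all hypothetical.

```python
class ToyGpuContextScheduler:
    """Toy stand-in for driver-level context scheduling (hypothetical model)."""

    def __init__(self):
        self.active = {}     # task name -> priority (larger = more important)
        self.running = None  # task whose context currently holds the GPU
        self.switches = []   # recorded (previous, next) context switches

    def _reschedule(self):
        # Pick the highest-priority task that is inside a GPU segment, if any.
        best = max(self.active, key=self.active.get) if self.active else None
        if best != self.running:
            self.switches.append((self.running, best))
            self.running = best

    def segment_begin(self, task, priority):
        """Hypothetical hook placed where a GPU segment starts."""
        self.active[task] = priority
        self._reschedule()

    def segment_end(self, task):
        """Hypothetical hook placed where a GPU segment finishes."""
        self.active.pop(task, None)
        self._reschedule()


sched = ToyGpuContextScheduler()
sched.segment_begin("low", priority=1)   # low-priority task gets the GPU
sched.segment_begin("high", priority=5)  # high-priority task preempts it
sched.segment_end("high")                # low-priority task resumes
sched.segment_end("low")                 # GPU becomes idle
```

In this model a preemption is just a change of `running`. On real hardware the switch would be carried out by the driver and would have a cost that a response time analysis has to account for.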
Taxonomy
Topics: Real-Time Systems Scheduling · Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
