Preemptive Thread Block Scheduling with Online Structural Runtime Prediction for Concurrent GPGPU Kernels
Sreepathi Pai, R. Govindarajan, Matthew J. Thazhuthaveetil

TL;DR
This paper introduces a preemptive scheduling policy for concurrent GPU kernels that uses online runtime prediction to improve performance and fairness over traditional FIFO scheduling.
Contribution
It proposes a novel Structural Runtime Predictor and a preemptive SRTF scheduling policy for GPUs, significantly enhancing throughput and fairness in concurrent kernel execution.
Findings
SRTF improves system throughput (STP) by 1.18x and average normalized turnaround time (ANTT) by 2.25x over FIFO.
The online predictor estimates kernel runtime accurately after profiling a single thread block.
SRTF/Adaptive enhances fairness and reduces scheduling latency.
Abstract
Recent NVIDIA Graphics Processing Units (GPUs) can execute multiple kernels concurrently. On these GPUs, the thread block scheduler (TBS) uses the FIFO policy to schedule their thread blocks. We show that FIFO leaves performance to chance, resulting in significant loss of performance and fairness. To improve performance and fairness, we propose use of the preemptive Shortest Remaining Time First (SRTF) policy instead. Although SRTF requires an estimate of runtime of GPU kernels, we show that such an estimate of the runtime can be easily obtained using online profiling and exploiting a simple observation on GPU kernels' grid structure. Specifically, we propose a novel Structural Runtime Predictor. Using a simple Staircase model of GPU kernel execution, we show that the runtime of a kernel can be predicted by profiling only the first few thread blocks. We evaluate an online predictor…
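The Staircase model and the SRTF choice described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes that every thread block of a kernel takes the same time (measured by profiling the first block), and that the GPU keeps a fixed number of blocks resident at once, so the kernel finishes in whole "steps" of that many blocks. The names `Kernel`, `staircase_runtime`, `remaining_time`, and `pick_srtf` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    total_blocks: int      # thread blocks in the kernel's grid
    done_blocks: int       # thread blocks that have already finished
    resident_blocks: int   # blocks the GPU can run concurrently (assumed fixed)
    block_time: float      # profiled runtime of one thread block


def staircase_runtime(blocks: int, resident_blocks: int, block_time: float) -> float:
    """Predict runtime as a number of full steps, each lasting one block time."""
    if resident_blocks <= 0:
        raise ValueError("resident_blocks must be positive")
    steps = math.ceil(blocks / resident_blocks)
    return steps * block_time


def remaining_time(k: Kernel) -> float:
    """Predicted time to finish the blocks that have not yet completed."""
    return staircase_runtime(k.total_blocks - k.done_blocks,
                             k.resident_blocks, k.block_time)


def pick_srtf(kernels: list[Kernel]) -> Kernel:
    """SRTF choice: the kernel with the shortest predicted remaining time."""
    return min(kernels, key=remaining_time)


if __name__ == "__main__":
    a = Kernel("A", total_blocks=100, done_blocks=40, resident_blocks=30, block_time=2.0)
    b = Kernel("B", total_blocks=20, done_blocks=0, resident_blocks=10, block_time=1.5)
    print(pick_srtf([a, b]).name)
```

A preemptive scheduler would re-run a choice like `pick_srtf` whenever a thread block completes or a new kernel arrives, so a newly arrived short kernel can take over thread block slots from a long-running one.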
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
