Kernelet: High-Throughput GPU Kernel Executions with Dynamic Slicing and Scheduling
Jianlong Zhong, Bingsheng He

TL;DR
Kernelet is a runtime system that improves GPU throughput in shared environments by dynamically slicing kernels into sub-kernels with tunable occupancy and scheduling them under the guidance of a Markov chain performance model, with reported performance improvements of up to 31.1% on Tesla C2050 and 23.4% on GTX680.
Contribution
It introduces a novel dynamic slicing and scheduling approach for concurrent GPU kernels, guided by a Markov chain performance model, to optimize throughput.
Findings
Achieves up to 31.1% performance improvement on Tesla C2050.
Achieves up to 23.4% performance improvement on GTX680.
Effectively utilizes GPU resources in shared environments.
Abstract
Graphics processors, or GPUs, have recently been widely used as accelerators in shared environments such as clusters and clouds. In such shared environments, many kernels are submitted to GPUs by different users, and throughput is an important metric for performance and total cost of ownership. Despite recently improved runtime support for concurrent GPU kernel executions, the GPU can be severely underutilized, resulting in suboptimal throughput. In this paper, we propose Kernelet, a runtime system with dynamic slicing and scheduling techniques to improve the throughput of concurrent kernel executions on the GPU. With slicing, Kernelet divides a GPU kernel into multiple sub-kernels (namely slices). Each slice has tunable occupancy to allow co-scheduling with other slices and to fully utilize the GPU resources. We develop a novel and effective Markov chain based performance model…
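The slicing idea in the abstract can be illustrated with a small host-side sketch. This is not Kernelet's implementation: the names `slice_kernel` and `interleave` are hypothetical, the slice sizes are arbitrary, and plain round-robin interleaving stands in for the Markov chain guided scheduler, whose details are not given in the text above. The sketch only assumes that a kernel's grid of thread blocks can be split into consecutive ranges, each of which could be launched as its own sub-kernel.

```python
from itertools import zip_longest
from typing import Iterator, List, Tuple

# A slice is (kernel name, first thread-block index, number of thread blocks).
Slice = Tuple[str, int, int]


def slice_kernel(name: str, total_blocks: int, slice_size: int) -> List[Slice]:
    """Split a grid of `total_blocks` thread blocks into consecutive slices.

    `slice_size` is the tunable knob: smaller slices leave more room for
    slices of other kernels to run alongside. The last slice may be shorter.
    """
    if total_blocks <= 0 or slice_size <= 0:
        raise ValueError("total_blocks and slice_size must be positive")
    return [
        (name, start, min(slice_size, total_blocks - start))
        for start in range(0, total_blocks, slice_size)
    ]


def interleave(a: List[Slice], b: List[Slice]) -> Iterator[Slice]:
    """Alternate slices of two kernels (a simple round-robin stand-in,
    NOT the model-guided scheduling described in the paper)."""
    for pair in zip_longest(a, b):
        for s in pair:
            if s is not None:
                yield s


# Hypothetical example: two kernels with different grid sizes.
slices_a = slice_kernel("kernel_a", total_blocks=10, slice_size=4)
slices_b = slice_kernel("kernel_b", total_blocks=6, slice_size=3)
schedule = list(interleave(slices_a, slices_b))

for name, first_block, count in schedule:
    print(f"launch {name}: blocks {first_block}..{first_block + count - 1}")
```

On a real GPU, each entry of `schedule` would correspond to one sub-kernel launch that offsets its block index by `first_block`; choosing `slice_size` and the pairing of kernels is the part the paper delegates to its performance model.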
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Advanced Data Storage Technologies
