Improving Multi-Application Concurrency Support Within the GPU Memory System
Rachata Ausavarungnirun, Christopher J. Rossbach, Vance Miller, Joshua Landgraf, Saugata Ghose, Jayneel Gandhi, Adwait Jog, Onur Mutlu

TL;DR
This paper identifies the memory system as a key bottleneck in multi-application GPU execution and proposes MASK, a new memory hierarchy design that improves virtual memory support and reduces contention.
Contribution
The paper introduces MASK, a novel GPU memory hierarchy extension that enhances multi-application concurrency by reducing TLB contention and improving address translation efficiency.
Findings
MASK significantly reduces TLB miss rates.
MASK improves GPU throughput for multi-application workloads.
MASK decreases inter-core thrashing in the GPU memory system.
Abstract
GPUs exploit a high degree of thread-level parallelism to hide long-latency stalls. Due to the heterogeneous compute requirements of different applications, there is a growing need to share the GPU across multiple applications in large-scale computing environments. However, while CPUs offer relatively seamless multi-application concurrency, and are an excellent fit for multitasking and for virtualized environments, GPUs currently offer only primitive support for multi-application concurrency. Much of the problem in a contemporary GPU lies within the memory system, where multi-application execution requires virtual memory support to manage the address spaces of each application and to provide memory protection. In this work, we perform a detailed analysis of the major problems in state-of-the-art GPU virtual memory management that hinder multi-application execution. Existing GPUs are…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
