Shared Virtual Memory: Its Design and Performance Implications for Diverse Applications
Bennett Cooper, Thomas R. W. Scogland, Rong Ge

TL;DR
This paper investigates Shared Virtual Memory (SVM) on AMD GPUs: it analyzes the SVM design, quantifies its performance impact on applications, and proposes mitigation strategies that improve application efficiency when GPU memory is oversubscribed.
Contribution
The paper provides the first comprehensive analysis of the SVM design and its interaction with applications, along with strategies to mitigate the performance degradation caused by its prefetching and eviction policies.
Findings
SVM employs aggressive prefetching for demand paging.
Prefetching is efficient without oversubscription, but causes thrashing under oversubscription.
The proposed algorithms and design changes can mitigate these performance issues.
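The second finding can be illustrated with a toy model. The sketch below is not the AMD SVM driver's algorithm: the LRU eviction policy, the fixed aligned prefetch block, and the name `simulate` are all assumptions chosen for illustration. It counts page faults and migrated pages for a device memory of `capacity` pages, where each fault migrates the whole aligned block containing the faulting page.

```python
from collections import OrderedDict


def simulate(accesses, capacity, prefetch_block):
    """Toy model (hypothetical): return (faults, pages_migrated).

    Assumes LRU eviction and fixed, aligned prefetch blocks; neither is
    taken from the paper. Requires capacity >= prefetch_block.
    """
    resident = OrderedDict()  # resident pages, least recently used first
    faults = 0
    migrated = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # hit: mark as most recently used
            continue
        faults += 1
        # On a fault, migrate the whole aligned block around the page.
        start = (page // prefetch_block) * prefetch_block
        for p in range(start, start + prefetch_block):
            if p in resident:
                continue
            if len(resident) >= capacity:
                resident.popitem(last=False)  # evict the LRU page
            resident[p] = None
            migrated += 1
        resident.move_to_end(page)
    return faults, migrated


# Dense access, allocation fits in device memory (no oversubscription):
dense = list(range(64)) * 3
print(simulate(dense, capacity=64, prefetch_block=1))  # (64, 64)
print(simulate(dense, capacity=64, prefetch_block=8))  # (8, 64)

# Strided access over a 64-page allocation with only 32 pages of memory:
strided = list(range(0, 64, 4)) * 3
print(simulate(strided, capacity=32, prefetch_block=1))  # (16, 16)
print(simulate(strided, capacity=32, prefetch_block=4))  # (48, 192)
```

In the dense case, prefetching cuts the fault count eightfold while moving the same number of pages. In the oversubscribed strided case, the 16 pages actually touched would fit in memory on their own, but the prefetched neighbours inflate the footprint to 64 pages, so every access in every pass faults again and the pages migrated grow twelvefold.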
Abstract
Discrete GPU accelerators, while providing massive computing power for supercomputers and data centers, have their own separate memory domain. Explicit memory management across the device and host domains is tedious and error-prone. To improve programming portability and productivity, Unified Memory (UM) integrates GPU memory into the host virtual memory system and provides both transparent data migration between host and GPU memory and GPU memory oversubscription. Nevertheless, current UM technologies cause significant performance loss for applications. With AMD GPUs increasingly being integrated into the world's leading supercomputers, it is necessary to understand their Shared Virtual Memory (SVM) and mitigate the performance impacts. In this work, we delve into the SVM design, examine its interactions with applications' data accesses at fine granularity, and quantitatively analyze its…
