Recent Advances in Overcoming Bottlenecks in Memory Systems and Managing Memory Resources in GPU Systems
Onur Mutlu, Saugata Ghose, Rachata Ausavarungnirun

TL;DR
This paper reviews recent research on bottlenecks in memory systems and their effect on GPU systems, focusing on contention, memory latency and access overheads, and resource management, which together degrade performance, quality-of-service, and energy efficiency.
Contribution
It provides extended summaries and retrospectives of ten works from the SAFARI research group that address inter-application interference, memory latency and overheads, and memory resource management in GPU systems.
Findings
Contention among applications at shared components slows down each application or thread in multicore systems, heterogeneous systems, and GPUs.
New techniques reduce memory latency and overheads.
Improved memory management enhances energy efficiency.
Abstract
This article features extended summaries and retrospectives of some of the recent research done by our research group, SAFARI, on (1) various critical problems in memory systems and (2) how memory system bottlenecks affect graphics processing unit (GPU) systems. As more applications share a single system, operations from each application can contend with each other at various shared components. Such contention can slow down each application or thread of execution. The compound effect of contention, high memory latency and access overheads, as well as inefficient management of resources, greatly degrades performance, quality-of-service, and energy efficiency. The ten works featured in this issue study several aspects of (1) inter-application interference in multicore systems, heterogeneous systems, and GPUs; (2) the growing overheads and expenses associated with growing memory densities…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Cloud Computing and Resource Management
