Holistic Management of the GPGPU Memory Hierarchy to Manage Warp-level Latency Tolerance
Rachata Ausavarungnirun, Saugata Ghose, Onur Kayıran, Gabriel H. Loh, Chita R. Das, Mahmut T. Kandemir, Onur Mutlu

TL;DR
This paper introduces Memory Divergence Correction (MeDiC), a set of techniques to mitigate warp-level memory divergence and cache queuing delays in GPGPU architectures, significantly improving performance and energy efficiency.
Contribution
The paper presents novel insights into warp memory divergence behavior and proposes MeDiC, a comprehensive approach to reduce divergence effects and cache queuing delays in GPGPU memory management.
Findings
MeDiC achieves 21.8% average speedup.
MeDiC improves energy efficiency by 20.1%.
Effective across diverse GPGPU applications.
Abstract
In a modern GPU architecture, all threads within a warp execute the same instruction in lockstep. For a memory instruction, this can lead to memory divergence: the memory requests for some threads are serviced early, while the remaining requests incur long latencies. This divergence stalls the warp, as it cannot execute the next instruction until all requests from the current instruction complete. In this work, we make three new observations. First, GPGPU warps exhibit heterogeneous memory divergence behavior at the shared cache: some warps have most of their requests hit in the cache, while other warps see most of their requests miss. Second, a warp retains the same divergence behavior for long periods of execution. Third, requests going to the shared cache can incur queuing delays as large as hundreds of cycles, exacerbating the effects of memory divergence. We propose a set of…
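The stall behavior described in the abstract can be illustrated with a small Python sketch. It is not the paper's mechanism or simulator. The latency constants, the `threshold` parameter, the "balanced" label, and the function names are all hypothetical choices made for illustration. The sketch only shows that a warp waits for its slowest request, and that warps can be labeled by how many of their requests hit in the shared cache.

```python
# Illustrative model of warp-level memory divergence.
# All constants below are assumptions, not values from the paper.
HIT_LATENCY = 20     # assumed cycles for a shared-cache hit
MISS_LATENCY = 300   # assumed cycles for a miss serviced by main memory


def warp_stall_cycles(hits, queuing_delay=0):
    """Cycles until the warp can issue its next instruction.

    `hits` is a non-empty list with one boolean per memory request of the
    warp's current instruction (True = shared-cache hit). The warp must wait
    for its slowest request, so the stall is the maximum latency, not the
    average. `queuing_delay` is added uniformly to every request.
    """
    latencies = [(HIT_LATENCY if h else MISS_LATENCY) + queuing_delay
                 for h in hits]
    return max(latencies)


def classify_warp(hits, threshold=0.7):
    """Label a warp by its shared-cache hit ratio (threshold is hypothetical)."""
    ratio = sum(hits) / len(hits)
    if ratio >= threshold:
        return "mostly-hit"
    if ratio <= 1 - threshold:
        return "mostly-miss"
    return "balanced"


if __name__ == "__main__":
    one_miss = [True] * 31 + [False]   # 31 hits, 1 miss
    print(classify_warp(one_miss), warp_stall_cycles(one_miss))
```

Under these assumed latencies, a warp with 31 hits and a single miss still stalls for the full miss latency, which is why a mostly-hit warp gains little from its hits unless the remaining miss is also avoided.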
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Embedded Systems Design Techniques
