Irregular Accesses Reorder Unit: Improving GPGPU Memory Coalescing for Graph-Based Workloads
Albert Segura, Jose-Maria Arnau, Antonio Gonzalez

TL;DR
The paper introduces the IRU, a hardware extension for GPGPU architectures that reorders irregular data accesses to improve memory coalescing, boosting performance and energy efficiency for graph-based workloads.
Contribution
It presents the IRU, a novel hardware unit that improves memory coalescing and reduces memory traffic on GPGPU architectures for irregular applications, while remaining simple to program.
Findings
Memory coalescing improved by 1.32x
Memory traffic reduced by 46%
Performance increased by 1.33x
Abstract
GPGPU architectures have become established as the dominant parallelization and performance platform achieving exceptional popularization and empowering domains such as regular algebra, machine learning, image detection and self-driving cars. However, irregular applications struggle to fully realize GPGPU performance as a result of control flow divergence and memory divergence due to irregular memory access patterns. To ameliorate these issues, programmers are obligated to carefully consider architecture features and devote significant efforts to modify the algorithms with complex optimization techniques, which shift programmers' priorities yet struggle to quell the shortcomings. We show that in graph-based GPGPU irregular applications these inefficiencies prevail, yet we find that it is possible to relax the strict relationship between thread and data processed to empower new…
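The effect of relaxing the thread-to-data mapping can be illustrated with a small software model. The sketch below is not the IRU's hardware mechanism; it only counts how many memory transactions a warp-based machine would issue for a given list of element indices, before and after the indices are reassigned to threads in sorted order. The warp size, the 128-byte transaction size, the 4-byte element size, and the function names are all assumptions chosen for illustration.

```python
from typing import List

WARP_SIZE = 32    # threads per warp (assumed)
LINE_BYTES = 128  # bytes served by one memory transaction (assumed)
ELEM_BYTES = 4    # bytes per accessed element (assumed)


def transactions(indices: List[int]) -> int:
    """Sum, over all warps, the number of distinct memory lines each warp touches.

    Thread t is assumed to access element indices[t]; consecutive groups of
    WARP_SIZE threads form a warp.
    """
    total = 0
    for start in range(0, len(indices), WARP_SIZE):
        warp = indices[start:start + WARP_SIZE]
        lines = {(i * ELEM_BYTES) // LINE_BYTES for i in warp}
        total += len(lines)
    return total


def reorder(indices: List[int]) -> List[int]:
    """Hypothetical stand-in for reordering: hand out elements in sorted order.

    This is only valid when any thread may process any element, which is the
    relaxation the abstract describes.
    """
    return sorted(indices)


# A scattered pattern: thread t reads element 32 * (t % 32) + t // 32,
# so every thread of a warp lands on a different memory line.
scattered = [32 * (t % 32) + t // 32 for t in range(1024)]

before = transactions(scattered)          # 32 lines per warp, 32 warps
after = transactions(reorder(scattered))  # 1 line per warp, 32 warps
print(before, after)  # prints "1024 32"
```

In this toy pattern every warp goes from touching 32 lines to touching one. Real graph workloads are far less uniform, so the gains reported in the findings above are much smaller than this worst-case contrast.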
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Graph Theory and Algorithms
