Performance Impact of Memory Channels on Sparse and Irregular Algorithms
Oded Green, James Fox, Jeffrey Young, Jun Shirako, David Bader

TL;DR
This paper investigates how the number of memory channels, rather than raw bandwidth or latency, significantly influences the performance of graph algorithms on various hardware architectures.
Contribution
It demonstrates that memory channel count is a key performance factor in graph processing, challenging the common focus on bandwidth and latency.
Findings
The number of memory channels affects graph algorithm performance more than raw peak bandwidth does.
GPU and CPU systems show performance variation based on memory channel utilization.
Adding more, narrower memory channels can improve performance in graph analytics.
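To make the channel-count argument concrete, the sketch below compares two hypothetical memory configurations with the same total bus width. It is a toy model, not the paper's methodology: the function names, the burst length of 8, and the fixed one-nanosecond burst time are all assumptions chosen for illustration. The model assumes each small random access occupies one channel for a whole burst, so any bytes in that burst beyond the requested ones are wasted.

```python
import math


def peak_bandwidth(channels, width_bits, burst_len=8, burst_time_ns=1.0):
    """Peak bytes per ns when every burst is fully used (assumed toy model)."""
    burst_bytes = width_bits * burst_len // 8
    return channels * burst_bytes / burst_time_ns


def useful_throughput(channels, width_bits, access_bytes,
                      burst_len=8, burst_time_ns=1.0):
    """Useful bytes per ns for independent random accesses of access_bytes.

    Assumption: each access is served by one channel and needs a whole
    number of bursts; all channels work in parallel on different accesses.
    """
    burst_bytes = width_bits * burst_len // 8
    bursts_per_access = math.ceil(access_bytes / burst_bytes)
    return channels * access_bytes / (bursts_per_access * burst_time_ns)


# Two hypothetical configurations with the same total width (512 bits).
wide = dict(channels=1, width_bits=512)
narrow = dict(channels=8, width_bits=64)

for name, cfg in (("1 x 512-bit", wide), ("8 x 64-bit", narrow)):
    print(name,
          "peak:", peak_bandwidth(**cfg),
          "useful (8-byte accesses):", useful_throughput(access_bytes=8, **cfg))
```

Under these assumptions both configurations have identical peak bandwidth, but the eight-channel layout delivers eight times the useful throughput for 8-byte random accesses, because useful throughput scales with the number of channels rather than with bus width. Real memory systems add row-buffer effects, queuing, and latency that this sketch ignores.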
Abstract
Graph processing is typically considered to be a memory-bound rather than compute-bound problem. One common line of thought is that more available memory bandwidth corresponds to better graph processing performance. However, in this work we demonstrate that the key factor in the utilization of the memory system for graph algorithms is not necessarily the raw bandwidth or even the latency of memory requests. Instead, we show that performance is proportional to the number of memory channels available to handle small data transfers with limited spatial locality. Using several widely used graph frameworks, including Gunrock (on the GPU) and GAPBS and Ligra (for CPUs), we evaluate key graph analytics kernels using two unique memory hierarchies, DDR-based and HBM/MCDRAM. Our results show that the differences in the peak bandwidths of several Pascal-generation GPU memory subsystems aren't…
