Optimizing Graph Processing and Preprocessing with Hardware Assisted Propagation Blocking
Vignesh Balaji, Brandon Lucia

TL;DR
This paper introduces COBRA, a hardware-assisted architecture that removes the bottlenecks Propagation Blocking (PB) suffers on conventional multicores, achieving end-to-end speedups of up to 4.6x (3.5x on average) across graph processing and pre-processing.
Contribution
The paper proposes COBRA, a novel architecture that supports hardware-assisted Propagation Blocking, improving performance of graph analytics and pre-processing tasks.
Findings
COBRA achieves end-to-end speedups of up to 4.6x (3.5x on average).
The gains apply to both graph analytics kernels and graph pre-processing tasks.
Architecture support eliminates the inefficiencies that remain when PB runs on conventional multicores.
Abstract
Extensive prior research has focused on alleviating the characteristic poor cache locality of graph analytics workloads. However, graph pre-processing tasks remain relatively unexplored. In many important scenarios, graph pre-processing tasks can be as expensive as the downstream graph analytics kernel. We observe that Propagation Blocking (PB), a software optimization designed for SpMV kernels, generalizes to many graph analytics kernels as well as common pre-processing tasks. In this work, we identify the lingering inefficiencies of a PB execution on conventional multicores and propose architecture support to eliminate PB's bottlenecks, further improving the performance gains from PB. Our proposed architecture -- COBRA -- optimizes the PB execution of both graph processing and pre-processing alike to provide end-to-end speedups of up to 4.6x (3.5x on average).
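To make the abstract's description of Propagation Blocking concrete, here is a minimal software-only sketch of the general two-phase idea (binning, then accumulation) applied to a scatter-sum over edges. It is an illustration under stated assumptions, not the paper's implementation and not COBRA's hardware mechanism: the function names `pb_scatter_sum` and `direct_scatter_sum`, the `bin_width` parameter, and the edge-list layout are all hypothetical.

```python
def direct_scatter_sum(edges, contrib, num_vertices):
    """Baseline: each update writes to an arbitrary destination vertex,
    so accesses to `sums` are irregular across the whole array."""
    sums = [0.0] * num_vertices
    for src, dst in edges:
        sums[dst] += contrib[src]
    return sums


def pb_scatter_sum(edges, contrib, num_vertices, bin_width):
    """Propagation Blocking sketch (hypothetical helper, not the paper's code).

    bin_width is an assumed tuning parameter: the number of consecutive
    destination vertex IDs covered by one bin.
    """
    num_bins = (num_vertices + bin_width - 1) // bin_width
    bins = [[] for _ in range(num_bins)]

    # Binning phase: stream over the edges and append each (dst, value)
    # update to the bin that owns dst's ID range. Writes are appends.
    for src, dst in edges:
        bins[dst // bin_width].append((dst, contrib[src]))

    # Accumulate phase: drain one bin at a time, so the irregular writes
    # are confined to a bin_width-sized slice of `sums`.
    sums = [0.0] * num_vertices
    for updates in bins:
        for dst, value in updates:
            sums[dst] += value
    return sums


edges = [(0, 3), (1, 3), (2, 0), (3, 1), (0, 2)]
contrib = [1.0, 2.0, 4.0, 8.0]
blocked = pb_scatter_sum(edges, contrib, num_vertices=4, bin_width=2)
direct = direct_scatter_sum(edges, contrib, num_vertices=4)
```

Both functions compute the same per-vertex sums; the blocked version trades one extra pass over the buffered updates for accumulation that touches only a small range of destinations at a time.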
Taxonomy
Topics: Graph Theory and Algorithms · Parallel Computing and Optimization Techniques · Advanced Graph Neural Networks
