Communication-Avoiding SpGEMM via Trident Partitioning on Hierarchical GPU Interconnects
Julian Bellavita, Lorenzo Pichetti, Thomas Pasquali, Flavio Vella, Giulia Guidi

TL;DR
This paper introduces Trident, a hierarchy-aware 2D distributed SpGEMM algorithm that reduces communication and improves performance on hierarchical GPU interconnects by exploiting intra-node bandwidth advantages.
Contribution
The paper presents a novel trident partitioning scheme and communication-avoiding techniques tailored to hierarchical GPU interconnects, yielding up to 2.38x speedup over traditional 2D SpGEMM.
Findings
Up to 2.38x speedup over traditional 2D SpGEMM
Inter-node communication volume reduced by up to 2x
Effective acceleration of Markov Clustering tasks
Abstract
The multiplication of two sparse matrices, known as SpGEMM, is a key kernel in scientific computing and large-scale data analytics, underpinning graph algorithms, machine learning, simulations, and computational biology, where sparsity is often highly unstructured. This unstructured sparsity makes achieving high performance challenging because it limits both memory efficiency and scalability. In distributed memory, the cost of exchanging and merging partial products across nodes further constrains performance. These issues are exacerbated on modern heterogeneous supercomputers with deep, hierarchical GPU interconnects. Current SpGEMM implementations overlook the gap between intra-node and inter-node bandwidth, resulting in unnecessary data movement and synchronization that fail to fully exploit the fast intra-node interconnect. To address these challenges, we introduce Trident, a…
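For readers unfamiliar with the kernel, the sketch below shows what "merging partial products" means in SpGEMM. It is a minimal single-process, row-wise (Gustavson-style) illustration in Python, not the Trident algorithm and not the authors' code; the function name `spgemm_rowwise` and the dict-of-dicts storage format are assumptions made purely for illustration.

```python
from collections import defaultdict


def spgemm_rowwise(A, B):
    """Multiply sparse matrices stored as {row: {col: value}} dicts.

    Hypothetical helper for illustration only: a row-wise
    (Gustavson-style) SpGEMM on a single process.
    """
    C = {}
    for i, a_row in A.items():
        acc = defaultdict(float)  # sparse accumulator for row i of C
        for k, a_ik in a_row.items():
            # Each nonzero A[i][k] scales row k of B; the scaled rows
            # are partial products that must be merged by column index.
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            C[i] = dict(acc)
    return C


# Small example: A is 2x3, B is 3x2.
A = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
B = {0: {1: 4.0}, 1: {0: 5.0}, 2: {1: 6.0}}
C = spgemm_rowwise(A, B)
```

In a distributed setting, the rows of B needed by a process may live on other nodes, so the inner loop implies communication; that data exchange is the cost the abstract refers to.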
Taxonomy
Topics: Graph Theory and Algorithms · Parallel Computing and Optimization Techniques · Interconnection Networks and Systems
