TL;DR
This paper presents scalable parallel algorithms for sparse matrix-matrix multiplication (SpGEMM) and SpGEMM-based sparse matrix indexing in distributed memory, with experiments showing scaling up to thousands of processors.
Contribution
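It introduces a flexible, scalable, and memory-efficient parallel SpGEMM implementation that combines two-dimensional block data distributions with serial hypersparse kernels, and shows that such an implementation also yields efficient general sparse-matrix indexing in distributed memory.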
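The 2D block idea can be illustrated with a small single-process sketch. This is not the paper's implementation: the dictionary-of-coordinates storage, the function names `spgemm`, `split_2d`, and `spgemm_2d`, and the sequential loop over a simulated processor grid are all assumptions made for illustration. In particular, the plain dictionary stands in for the hypersparse serial kernel, and no real communication takes place.

```python
from collections import defaultdict

def spgemm(A, B):
    """Multiply two sparse matrices stored as {(row, col): value} dicts."""
    # Group B's nonzeros by row so each nonzero A[i, k] can find B[k, :].
    B_rows = defaultdict(list)
    for (k, j), b in B.items():
        B_rows[k].append((j, b))
    C = defaultdict(float)
    for (i, k), a in A.items():
        for j, b in B_rows.get(k, ()):
            C[(i, j)] += a * b
    return {ij: v for ij, v in C.items() if v != 0}

def split_2d(M, n, grid):
    """Partition an n-by-n matrix into grid x grid blocks with local indices."""
    bs = -(-n // grid)  # block size, rounded up
    blocks = defaultdict(dict)
    for (i, j), v in M.items():
        blocks[(i // bs, j // bs)][(i % bs, j % bs)] = v
    return blocks, bs

def spgemm_2d(A, B, n, grid):
    """Simulate a 2D block SpGEMM: 'processor' (bi, bj) owns block C[bi, bj]
    and accumulates the products A[bi, k] * B[k, bj] over all k."""
    Ab, bs = split_2d(A, n, grid)
    Bb, _ = split_2d(B, n, grid)
    C = {}
    for bi in range(grid):
        for bj in range(grid):
            local = defaultdict(float)
            for k in range(grid):
                partial = spgemm(Ab.get((bi, k), {}), Bb.get((k, bj), {}))
                for ij, v in partial.items():
                    local[ij] += v
            # Map local block indices back to global indices.
            for (i, j), v in local.items():
                if v != 0:
                    C[(bi * bs + i, bj * bs + j)] = v
    return C

A = {(0, 0): 1.0, (0, 3): 2.0, (2, 1): 3.0, (3, 2): 4.0}
B = {(0, 1): 5.0, (1, 2): 6.0, (3, 0): 7.0, (2, 2): 8.0}
assert spgemm_2d(A, B, n=4, grid=2) == spgemm(A, B)
```

In a real distributed setting each block would live on a different process, and most blocks would contain far fewer nonzeros than rows or columns, which is the regime hypersparse kernels target.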
Findings
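Reported as the first SpGEMM algorithm to yield increasing speedup on an unbounded number of processors
Experiments show scaling up to thousands of processors in a variety of test scenarios
SpGEMM, a key primitive for graph algorithms and for linear solvers such as algebraic multigrid, also yields efficient sparse-matrix indexing in distributed memory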
Abstract
Generalized sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. Here we show that SpGEMM also yields efficient algorithms for general sparse-matrix indexing in distributed memory, provided that the underlying SpGEMM implementation is sufficiently flexible and scalable. We demonstrate that our parallel SpGEMM methods, which use two-dimensional block data distributions with serial hypersparse kernels, are indeed highly flexible, scalable, and memory-efficient in the general case. This algorithm is the first to yield increasing speedup on an unbounded number of processors; our experiments show scaling up to thousands of processors in a variety of test scenarios.
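One way to see how SpGEMM can express sparse-matrix indexing is to write the submatrix A(I, J) as a triple product R·A·Q, where R and Q are sparse selection matrices built from the index vectors. The sketch below illustrates that identity on a single process. The dictionary storage and the names `spgemm` and `sp_ref` are assumptions made for illustration, not the paper's code.

```python
from collections import defaultdict

def spgemm(A, B):
    """Multiply two sparse matrices stored as {(row, col): value} dicts."""
    B_rows = defaultdict(list)
    for (k, j), b in B.items():
        B_rows[k].append((j, b))
    C = defaultdict(float)
    for (i, k), a in A.items():
        for j, b in B_rows.get(k, ()):
            C[(i, j)] += a * b
    return {ij: v for ij, v in C.items() if v != 0}

def sp_ref(A, I, J):
    """Return the submatrix A(I, J) as R * A * Q.

    R is len(I)-by-m with R[k, I[k]] = 1 (selects rows);
    Q is n-by-len(J) with Q[J[k], k] = 1 (selects columns).
    """
    R = {(k, i): 1 for k, i in enumerate(I)}
    Q = {(j, k): 1 for k, j in enumerate(J)}
    return spgemm(spgemm(R, A), Q)

A = {(0, 0): 1.0, (1, 2): 2.0, (2, 1): 3.0, (2, 2): 4.0}
sub = sp_ref(A, I=[2, 0], J=[2, 1])
# Row 0 of sub is row 2 of A restricted to columns 2 and 1;
# row 1 of sub is row 0 of A, which has no entries in those columns.
assert sub == {(0, 0): 4.0, (0, 1): 3.0}
```

Because the index vectors become ordinary sparse matrices, repeated or permuted indices need no special handling, and the indexing operation inherits whatever scalability the underlying SpGEMM provides.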
