Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures

Mehmet Deveci; Christian Trott; Sivasankaran Rajamanickam

arXiv:1801.03065·cs.DC·January 10, 2018

TL;DR

This paper presents parallel algorithms for sparse matrix-matrix multiplication optimized for many-core and GPU architectures, emphasizing performance portability and data structure choices, with a meta-algorithm for adaptive selection.

Contribution

It introduces a meta-algorithm, kkSpGEMM, that adaptively selects algorithms and data structures based on problem characteristics for improved performance.

Findings

01

Performance depends on the accumulator data structure used.

02

kkSpGEMM chooses the algorithm and data structure based on the characteristics of the problem.

03

The authors argue that the community needs to develop two-phase implementations.

Abstract

Sparse matrix-matrix multiplication is a key kernel with applications in several domains, such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high-performance computing architectures. The performance of these algorithms depends on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse…
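To make the terms "accumulator" and "two-phase" concrete, here is a minimal serial sketch of row-wise sparse matrix-matrix multiplication on CSR inputs. It is not the paper's kkSpGEMM code. The CSR tuple layout and the names `symbolic_phase`, `numeric_phase`, `hash_accumulator`, and `dense_accumulator` are hypothetical, chosen for illustration only. The sketch assumes that "two-phase" means a symbolic pass that sizes the output followed by a numeric pass that fills it, which is the common usage of the term.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical CSR layout for this sketch: (row_ptr, col_idx, values).
CSR = Tuple[List[int], List[int], List[float]]
Row = List[Tuple[int, float]]


def symbolic_phase(a: CSR, b: CSR) -> List[int]:
    """Phase 1: build the row pointers of C = A * B from sparsity patterns only."""
    a_ptr, a_col, _ = a
    b_ptr, b_col, _ = b
    c_ptr = [0]
    for i in range(len(a_ptr) - 1):
        cols = set()  # only column indices are tracked, no values
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_col[p]
            cols.update(b_col[b_ptr[k]:b_ptr[k + 1]])
        c_ptr.append(c_ptr[-1] + len(cols))
    return c_ptr


def hash_accumulator(i: int, a: CSR, b: CSR, n_cols: int) -> Row:
    """Sparse accumulator: memory grows with the nonzeros of the output row."""
    a_ptr, a_col, a_val = a
    b_ptr, b_col, b_val = b
    acc: Dict[int, float] = {}
    for p in range(a_ptr[i], a_ptr[i + 1]):
        k = a_col[p]
        for q in range(b_ptr[k], b_ptr[k + 1]):
            j = b_col[q]
            acc[j] = acc.get(j, 0.0) + a_val[p] * b_val[q]
    return sorted(acc.items())


def dense_accumulator(i: int, a: CSR, b: CSR, n_cols: int) -> Row:
    """Dense accumulator: O(1) indexed updates, but O(n_cols) memory per row."""
    a_ptr, a_col, a_val = a
    b_ptr, b_col, b_val = b
    vals = [0.0] * n_cols
    seen = [False] * n_cols
    touched: List[int] = []  # columns that received at least one product
    for p in range(a_ptr[i], a_ptr[i + 1]):
        k = a_col[p]
        for q in range(b_ptr[k], b_ptr[k + 1]):
            j = b_col[q]
            if not seen[j]:
                seen[j] = True
                touched.append(j)
            vals[j] += a_val[p] * b_val[q]
    return [(j, vals[j]) for j in sorted(touched)]


def numeric_phase(
    a: CSR,
    b: CSR,
    c_ptr: List[int],
    n_cols: int,
    accumulate: Callable[[int, CSR, CSR, int], Row],
) -> CSR:
    """Phase 2: fill preallocated output arrays using the chosen accumulator."""
    c_col = [0] * c_ptr[-1]
    c_val = [0.0] * c_ptr[-1]
    for i in range(len(c_ptr) - 1):
        for offset, (j, v) in enumerate(accumulate(i, a, b, n_cols)):
            c_col[c_ptr[i] + offset] = j
            c_val[c_ptr[i] + offset] = v
    return c_ptr, c_col, c_val
```

Because the symbolic pass fixes where each output row starts, every row in the numeric pass writes to its own disjoint slice, which is what makes the rows independent units of work. Swapping `hash_accumulator` for `dense_accumulator` changes memory use and access pattern without changing the result, which is the kind of trade-off the abstract refers to when it compares accumulators.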
