Model-guided Performance Analysis of the Sparse Matrix-Matrix Multiplication
Tobias Scharpff, Klaus Iglberger, Georg Hager, Ulrich Ruede

TL;DR
This paper analyzes the performance of sparse matrix-matrix multiplication kernels within the Blaze framework, develops models to estimate maximum performance, and compares implementations with other C++ libraries, demonstrating competitive or superior efficiency.
Contribution
The paper introduces simple performance models for sparse matrix-matrix multiplication and uses them, together with benchmarks against other SET-based C++ libraries, to assess the efficiency of Blaze's kernels.
Findings
Blaze's sparse matrix-matrix multiplication is competitive or faster for most problem sizes.
The performance models give usable estimates of the achievable maximum performance.
Blaze's implementations outperform other Smart Expression Template (SET) libraries in many cases.
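The paper's actual models are not reproduced in this summary. As a rough illustration of what a "simple model for achievable maximum performance" can look like, the sketch below assumes the common bandwidth-bound form: performance is capped either by the core's peak arithmetic rate or by the memory bandwidth divided by the kernel's code balance (bytes moved per flop). The function name and all parameters are hypothetical and chosen only for this example.

```cpp
#include <algorithm>

// Hypothetical bandwidth-bound estimate (an assumption, not the paper's model).
// peakFlops:            peak arithmetic rate of one core, in flop/s
// bandwidthBytesPerSec: attainable memory bandwidth, in bytes/s
// bytesPerFlop:         code balance of the kernel, in bytes moved per flop
// Returns an upper bound on the kernel's performance in flop/s.
double lightspeedEstimate(double peakFlops,
                          double bandwidthBytesPerSec,
                          double bytesPerFlop) {
    return std::min(peakFlops, bandwidthBytesPerSec / bytesPerFlop);
}
```

With a peak of 10 Gflop/s, a bandwidth of 20 GB/s, and a code balance of 8 bytes/flop, this estimate is bandwidth-limited at 2.5 Gflop/s. Comparing a measured rate against such a bound is one way to judge how close an implementation is to the hardware limit.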
Abstract
Achieving high efficiency with numerical kernels for sparse matrices is of utmost importance, since they are part of many simulation codes and tend to use most of the available compute time and resources. In addition, especially in large scale simulation frameworks the readability and ease of use of mathematical expressions are essential components for the continuous maintenance, modification, and extension of software. In this context, the sparse matrix-matrix multiplication is of special interest. In this paper we thoroughly analyze the single-core performance of sparse matrix-matrix multiplication kernels in the Blaze Smart Expression Template (SET) framework. We develop simple models for estimating the achievable maximum performance, and use them to assess the efficiency of our implementations. Additionally, we compare these kernels with several commonly used SET-based C++…
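For readers unfamiliar with the kernel itself, the following is a minimal sketch of a sparse matrix-matrix product on compressed sparse row (CSR) storage, using the classic row-wise (Gustavson-style) scheme with a dense accumulator. It is not Blaze's implementation; the `CsrMatrix` struct and the `spgemm` function are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container (illustration only, not a Blaze type).
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> rowPtr;  // rows + 1 offsets into colIdx/values
    std::vector<std::size_t> colIdx;  // column index of each stored element
    std::vector<double> values;       // value of each stored element
};

// Row-wise product C = A * B. Assumes A.cols == B.rows.
CsrMatrix spgemm(const CsrMatrix& A, const CsrMatrix& B) {
    CsrMatrix C;
    C.rows = A.rows;
    C.cols = B.cols;
    C.rowPtr.assign(A.rows + 1, 0);

    std::vector<double> acc(B.cols, 0.0);    // dense accumulator for one row of C
    std::vector<bool> used(B.cols, false);   // columns touched in the current row
    std::vector<std::size_t> touched;        // list of those columns

    for (std::size_t i = 0; i < A.rows; ++i) {
        touched.clear();
        // For every stored a(i,k), add a(i,k) * (row k of B) to the accumulator.
        for (std::size_t ka = A.rowPtr[i]; ka < A.rowPtr[i + 1]; ++ka) {
            const std::size_t k = A.colIdx[ka];
            const double a = A.values[ka];
            for (std::size_t kb = B.rowPtr[k]; kb < B.rowPtr[k + 1]; ++kb) {
                const std::size_t j = B.colIdx[kb];
                if (!used[j]) {
                    used[j] = true;
                    touched.push_back(j);
                }
                acc[j] += a * B.values[kb];
            }
        }
        // Emit the row in ascending column order and reset the scratch arrays.
        // Entries that cancel to exactly zero are still stored.
        std::sort(touched.begin(), touched.end());
        for (std::size_t j : touched) {
            C.colIdx.push_back(j);
            C.values.push_back(acc[j]);
            acc[j] = 0.0;
            used[j] = false;
        }
        C.rowPtr[i + 1] = C.colIdx.size();
    }
    return C;
}
```

Multiplying the 2x2 matrices [[1, 2], [0, 3]] and [[4, 0], [5, 6]] with this routine yields [[14, 12], [15, 18]]. The inner loops perform irregular, indirectly indexed memory accesses, which is why the efficiency of such kernels is usually discussed in terms of data traffic rather than raw arithmetic throughput.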
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
