Performance limitations for sparse matrix-vector multiplications on current multicore environments

Gerald Schubert; Georg Hager; Holger Fehske

arXiv:0910.4836·cs.PF·March 1, 2012

TL;DR

This paper analyzes the performance bottlenecks of sparse matrix-vector multiplication on multicore processors, comparing different storage schemes and kernels to optimize parallel implementations.

Contribution

It provides a detailed performance analysis of sparse matrix-vector multiplication (MVM) on multicore systems, starting from microbenchmarks and highlighting the limitations of, and optimization strategies for, cache-oriented and vector-oriented storage schemes.

Findings

01

Performance bottlenecks identified for sparse MVM on multicore systems

02

Comparison of cache-based and vector-oriented storage schemes

03

Insights into optimizing parallel sparse MVM implementations

Abstract

The increasing importance of multicore processors calls for a reevaluation of established numerical algorithms in view of their ability to profit from this new hardware concept. To optimize existing algorithms, detailed knowledge of the different performance-limiting factors is mandatory. In this contribution we investigate sparse matrix-vector multiplication (MVM), which is the dominant operation in many sparse eigenvalue solvers. Two conceptually different storage schemes and computational kernels have been conceived in the past to target cache-based and vector architectures, respectively. Starting from a series of microbenchmarks, we apply the insight gained to optimized sparse MVM implementations, whose serial and OpenMP-parallel performance we review on state-of-the-art multicore systems.
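
To make the kind of kernel discussed in the abstract concrete, here is a minimal sketch of an OpenMP-parallel sparse MVM. The abstract does not name the storage schemes, so this assumes compressed row storage (CRS), a format commonly used on cache-based processors. The function name `spmv_crs` and the array names `row_ptr`, `col_idx`, and `val` are hypothetical and not taken from the paper.

```c
#include <stddef.h>

/*
 * Sketch: y = A * x for an n-row sparse matrix A in CRS form.
 * Assumed layout (hypothetical names, not from the paper):
 *   val[k]      nonzero values, stored row by row
 *   col_idx[k]  column index of val[k]
 *   row_ptr[i]  offset of the first nonzero of row i; row_ptr[n] = nnz
 */
void spmv_crs(int n, const int *row_ptr, const int *col_idx,
              const double *val, const double *x, double *y)
{
    /* Rows are independent, so the outer loop can be shared among
       threads. Without OpenMP support the pragma is ignored and the
       loop runs serially. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        /* val and col_idx are streamed contiguously; x is accessed
           indirectly through col_idx. */
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            sum += val[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

For a 3x3 matrix with rows (1, 0, 2), (0, 3, 0), (4, 0, 5), the arrays would be `row_ptr = {0, 2, 3, 5}`, `col_idx = {0, 2, 1, 0, 2}`, and `val = {1, 2, 3, 4, 5}`. Each nonzero costs two floating-point operations but requires loading a value, an index, and an entry of `x`, which suggests why memory access is a natural candidate for the limiting factor in such a kernel.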
