An Auto-tuning Method for Run-time Data Transformation for Sparse Matrix-Vector Multiplication
Takahiro Katagiri, Masahiko Sato

TL;DR
This paper presents an auto-tuning method for transforming sparse matrix data formats at run-time to speed up sparse matrix-vector multiplication (SpMV). It reports a speedup factor of up to 151 on the Earth Simulator 2, with a transformation overhead of 0.01 to 1.0 times the SpMV time in the CRS format.
Contribution
It introduces an auto-tuning approach that uses the $D_{mat}^i$ - $R_{ell}^i$ graph to model, at run-time, how effective a transformation of the sparse matrix data format will be.
Findings
The ELL format (ELL-Row inner-parallelized) achieves a speedup factor of up to 151 on the Earth Simulator 2
Transformation overhead is small, between 0.01 and 1.0 times the SpMV time with the CRS format
The $D_{mat}^i$ - $R_{ell}^i$ graph effectively models transformation effectiveness
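
The two axes of this graph can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that $D_{mat}^i$ is the standard deviation of the number of non-zeros per row divided by its mean, and that $R_{ell}^i$ is the ELL-over-CRS SpMV speedup divided by the transformation time, with that time expressed as a multiple of one CRS SpMV. The function and parameter names are hypothetical.

```python
import math


def d_mat(row_ptr):
    """Assumed definition: std. deviation / mean of non-zeros per row.

    row_ptr is the CRS row-pointer array (length n_rows + 1).
    The matrix is assumed to contain at least one non-zero.
    """
    counts = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return math.sqrt(variance) / mean


def r_ell(t_spmv_crs, t_spmv_ell, t_transform_rel):
    """Assumed definition: (CRS time / ELL time) / transformation time.

    t_transform_rel is the CRS-to-ELL transformation time, here taken
    as a multiple of one CRS SpMV time (an assumption for illustration).
    """
    speedup = t_spmv_crs / t_spmv_ell
    return speedup / t_transform_rel


# A matrix whose rows hold 1 and 3 non-zeros: mean 2, std. deviation 1.
print(d_mat([0, 1, 4]))        # 0.5
print(r_ell(10.0, 2.0, 0.5))   # 10.0
```

Under these assumptions, a small $D_{mat}^i$ means the rows have similar lengths, so ELL needs little zero padding.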
Abstract
In this paper, we study run-time sparse matrix data transformation from Compressed Row Storage (CRS) to Coordinate (COO) storage and to the ELL (ELLPACK/ITPACK) format, with OpenMP parallelization, for sparse matrix-vector multiplication (SpMV). We propose an auto-tuning (AT) method that uses the $D_{mat}^i$ - $R_{ell}^i$ graph, which plots the deviation/average of the number of non-zero elements per row ($D_{mat}^i$) against the ratio of SpMV speedup to transformation time from CRS to ELL ($R_{ell}^i$). The experimental results show that the ELL format is very effective on the Earth Simulator 2: a speedup factor of 151 is obtained with the ELL-Row inner-parallelized format. The transformation overhead is also very small, from 0.01 to 1.0 times the SpMV time with the CRS format. In addition, the $D_{mat}^i$ - $R_{ell}^i$ graph can be modeled for the effectiveness of transformation according to the…
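
The CRS-to-ELL transformation and the two SpMV kernels mentioned in the abstract can be sketched as below. This is a serial, pure-Python illustration under stated assumptions, not the paper's OpenMP implementation. The function names, the row-major ELL layout, and the choice to pad short rows with value 0.0 and column index 0 are all assumptions made for the example.

```python
def crs_to_ell(row_ptr, col_idx, values):
    """Convert CRS arrays to padded ELL arrays (row-major lists of rows)."""
    n_rows = len(row_ptr) - 1
    # ELL width = the largest number of non-zeros found in any row.
    width = max(row_ptr[i + 1] - row_ptr[i] for i in range(n_rows))
    ell_val = [[0.0] * width for _ in range(n_rows)]
    ell_col = [[0] * width for _ in range(n_rows)]  # padding points at column 0
    for i in range(n_rows):
        for k, j in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_val[i][k] = values[j]
            ell_col[i][k] = col_idx[j]
    return ell_val, ell_col


def spmv_crs(row_ptr, col_idx, values, x):
    """y = A x with A in CRS; each row has its own loop length."""
    y = []
    for i in range(len(row_ptr) - 1):
        y.append(sum(values[j] * x[col_idx[j]]
                     for j in range(row_ptr[i], row_ptr[i + 1])))
    return y


def spmv_ell(ell_val, ell_col, x):
    """y = A x with A in ELL; every row has the same fixed loop length."""
    return [sum(v * x[c] for v, c in zip(row_v, row_c))
            for row_v, row_c in zip(ell_val, ell_col)]


# 3x3 example: [[1, 0, 2], [0, 3, 0], [4, 5, 6]]
row_ptr = [0, 2, 3, 6]
col_idx = [0, 2, 1, 0, 1, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0]

ell_val, ell_col = crs_to_ell(row_ptr, col_idx, values)
print(spmv_crs(row_ptr, col_idx, values, x))  # [7.0, 6.0, 32.0]
print(spmv_ell(ell_val, ell_col, x))          # [7.0, 6.0, 32.0]
```

The padded zeros add arithmetic but give every row the same loop length, which is the regularity that vector machines can exploit. The padding cost grows as row lengths become more uneven.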
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Numerical Methods and Algorithms
