PackSELL: A Sparse Matrix Format for Precision-Agnostic High-Performance SpMV
Kengo Suzuki, Takeshi Iwashita

TL;DR
PackSELL introduces a flexible sparse matrix format optimized for GPU-based SpMV, reducing memory use and boosting performance across various precisions, including non-IEEE formats.
Contribution
It presents a novel, configurable sparse matrix format that improves GPU SpMV efficiency and supports diverse data representations, including custom formats.
Findings
When configured for FP16, the PackSELL-based SpMV kernel outperforms the cuSPARSE SELL-based kernel by up to 1.63×.
With customized (non-IEEE) formats, PackSELL achieves FP32-level accuracy while exceeding the performance of FP16 cuSPARSE.
Using PackSELL in PCG solvers yields up to 2.09× speedup over full-precision implementations.
Abstract
We propose a new sparse matrix format, PackSELL, designed to support diverse data representations and enable efficient sparse matrix-vector multiplication (SpMV) on GPUs. Building on sliced ELLPACK (SELL), PackSELL incorporates delta encoding of column indices and a novel packing scheme that stores each index-delta-value pair in a single word, thereby reducing memory footprint and data movement. This design further enables fine-grained control over the bit allocation between deltas and values, allowing flexible data representations, including non-IEEE formats. Experimental results show that, when configured for half precision (FP16), the PackSELL-based SpMV kernel outperforms the cuSPARSE SELL-based kernel by up to 1.63×. Moreover, with configurations using customized formats, PackSELL achieves FP32-level accuracy while exceeding the performance of FP16 cuSPARSE. These benefits…
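To make the packing idea in the abstract concrete, the following is a minimal CPU-side sketch in Python, not the authors' GPU implementation. It assumes a 32-bit word whose low `delta_bits` bits hold the column-index delta and whose remaining high bits hold a truncated FP32 bit pattern. The exact bit layout, the rounding mode, and the handling of deltas that do not fit are not given in this summary, so the function names (`pack_entry`, `unpack_entry`, `pack_row`, `spmv_row`) and those choices are illustrative assumptions.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE-754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Interpret a 32-bit unsigned int as an IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def pack_entry(delta: int, value: float, delta_bits: int) -> int:
    """Pack one (delta, value) pair into a single 32-bit word.

    Assumed layout (hypothetical): high bits = truncated FP32 value,
    low `delta_bits` bits = column-index delta.
    """
    if not 0 <= delta < (1 << delta_bits):
        # How the real format treats oversized deltas is not stated here.
        raise ValueError("delta does not fit in the allotted bits")
    value_mask = (0xFFFFFFFF >> delta_bits) << delta_bits
    # Truncation (not rounding) of the low mantissa bits, for simplicity.
    return (f32_bits(value) & value_mask) | delta


def unpack_entry(word: int, delta_bits: int):
    """Split a packed word back into (delta, value)."""
    delta_mask = (1 << delta_bits) - 1
    delta = word & delta_mask
    value = bits_f32(word & ~delta_mask & 0xFFFFFFFF)
    return delta, value


def pack_row(cols, vals, delta_bits: int):
    """Delta-encode sorted column indices of one row and pack each entry."""
    words, prev = [], 0
    for c, v in zip(cols, vals):
        words.append(pack_entry(c - prev, v, delta_bits))
        prev = c
    return words


def spmv_row(words, x, delta_bits: int) -> float:
    """Compute one row of y = A x from packed words."""
    col, acc = 0, 0.0
    for w in words:
        d, v = unpack_entry(w, delta_bits)
        col += d  # running sum of deltas recovers the column index
        acc += v * x[col]
    return acc
```

Changing `delta_bits` trades index range for value precision: `delta_bits = 16` leaves 16 bits for the value, while a smaller `delta_bits` keeps more mantissa bits at the cost of a smaller maximum gap between consecutive nonzeros in a row.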
