Optimizing Sparse Matrix-Vector Multiplication on Emerging Many-Core Architectures
Shizhao Chen, Jianbin Fang, Donglin Chen, Chuanfu Xu, Zheng Wang

TL;DR
This paper studies how different sparse matrix representations affect performance on emerging many-core architectures and introduces a machine learning model to predict the optimal representation for unseen inputs, improving efficiency.
Contribution
It provides the first comprehensive analysis of sparse matrix representations on KNL and FTP architectures and develops a predictive model to select optimal representations without runtime overhead.
Findings
Optimal representation varies with architecture and input.
Machine learning model predicts best representation with high accuracy.
The model's predicted representations achieve 95% and 91% of optimal performance on KNL and FTP, respectively.
Abstract
Sparse matrix-vector multiplication (SpMV) is one of the most common operations in scientific and high-performance applications, and is often responsible for the application performance bottleneck. While the sparse matrix representation has a significant impact on the resulting application performance, choosing the right representation typically relies on expert knowledge and trial and error. This paper provides the first comprehensive study on the impact of sparse matrix representations on two emerging many-core architectures: Intel's Knights Landing (KNL) Xeon Phi and the ARM-based FT-2000Plus (FTP). Our large-scale experiments involved over 9,500 distinct profiling runs performed on 956 sparse datasets and five mainstream SpMV representations. We show that the best sparse matrix representation depends on the underlying architecture and the program input. To help developers to…
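For readers unfamiliar with the operation, the following is a minimal sketch of SpMV (y = A·x) using the compressed sparse row (CSR) layout. CSR is chosen here only as a familiar illustration. This summary does not name the five representations the paper evaluates, so the choice of CSR, the helper names `to_csr` and `spmv_csr`, and the small example matrix are all assumptions made for this sketch, not the authors' code.

```python
def to_csr(dense):
    """Convert a dense matrix (list of rows) into CSR arrays.

    Illustrative helper, not from the paper. Returns (values, col_idx, row_ptr).
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # nonzero value
                col_idx.append(j)  # its column index
        # row_ptr[i + 1] marks where row i ends in values/col_idx
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x where A is stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Only the nonzeros of row i are visited.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example matrix and input vector.
A = [[4, 0, 0],
     [0, 0, 2],
     [1, 3, 0]]
x = [1.0, 2.0, 3.0]

values, col_idx, row_ptr = to_csr(A)
y = spmv_csr(values, col_idx, row_ptr, x)
print(y)
```

The inner loop reads `x` through `col_idx`, so its memory access pattern depends on where the nonzeros sit in each row. This is one reason a storage layout that suits one matrix or one architecture can perform poorly on another.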
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Ferroelectric and Negative Capacitance Devices
