Explicit caching HYB: a new high-performance SpMV framework on GPGPU
Chong Chen

TL;DR
This paper introduces EHYB, a new GPU-based SpMV framework that explicitly caches input vectors and optimizes data movement, significantly improving performance over existing methods in finite element method applications.
Contribution
The paper presents a novel explicit caching framework for GPU SpMV that reduces data movement and enhances performance beyond current state-of-the-art approaches.
Findings
EHYB outperforms existing GPU SpMV implementations.
Significant speedup achieved through explicit input vector caching.
Reports higher FLOPS than the theoretical performance bound of existing GPU-based SpMV implementations.
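One way to read the last finding: a bandwidth-based bound depends on how many bytes are assumed to move per nonzero. The sketch below is a simple roofline-style estimate, not the paper's own model. The byte counts, the 900 GB/s bandwidth, and the function name `spmv_gflops_bound` are illustrative assumptions. It shows that shrinking per-nonzero traffic (for example, narrower column indices, or input-vector reads served from on-chip memory) raises the ceiling above the one computed for a format that pays the full global-memory cost.

```python
def spmv_gflops_bound(bandwidth_gbs, bytes_per_nnz):
    """Roofline-style upper bound for SpMV in GFLOP/s.

    Each nonzero contributes 2 floating-point operations (one multiply,
    one add), so the bound is 2 * bandwidth / bytes moved per nonzero.
    Traffic for the output vector and row pointers is ignored here.
    """
    if bytes_per_nnz <= 0:
        raise ValueError("bytes_per_nnz must be positive")
    return 2.0 * bandwidth_gbs / bytes_per_nnz


# Hypothetical device bandwidth in GB/s (an assumption, not a measurement).
BANDWIDTH_GBS = 900.0

# Assumed double-precision CSR-like traffic per nonzero: 8-byte value,
# 4-byte column index, 8-byte input-vector element from global memory.
baseline = spmv_gflops_bound(BANDWIDTH_GBS, 8 + 4 + 8)

# Assumed reduced traffic per nonzero: 8-byte value, 2-byte compact column
# index, input-vector element served from on-chip memory (counted as 0).
reduced = spmv_gflops_bound(BANDWIDTH_GBS, 8 + 2)

print(f"baseline bound:        {baseline:.1f} GFLOP/s")
print(f"reduced-traffic bound: {reduced:.1f} GFLOP/s")
```

Under these assumed numbers, halving the bytes moved per nonzero doubles the bound, so a kernel with less traffic can legitimately exceed a bound derived for a heavier format.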
Abstract
Sparse Matrix-Vector Multiplication (SpMV) is a critical operation in the iterative solvers used by Finite Element Methods in computer simulation. Because SpMV is memory-bound, the efficiency of data movement heavily influences its performance on GPUs. In recent years, much research has been conducted on accelerating SpMV on graphics processing units (GPUs). The optimization methods in existing studies focus on two areas: improving load balancing between GPU processors and reducing execution divergence between GPU threads. Although some studies have made preliminary optimizations to input vector fetching, the effect of explicitly caching the input vector in GPU-based SpMV has not yet been studied in depth. In this study, we try to minimize the data movement cost of GPU-based SpMV using a new…
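To make the idea of explicitly caching the input vector concrete, here is a minimal CPU-side sketch in Python. It is not the EHYB kernel or its data layout, which the truncated abstract does not describe. The function names, the `block_cols` parameter, and the small test matrix are all hypothetical. The first function is a plain CSR SpMV that reads `x` through the global column index for every nonzero. The second partitions the columns into blocks and copies each slice of `x` into a local buffer once, which stands in for GPU shared memory.

```python
def spmv_csr(row_ptr, col_idx, vals, x):
    """Baseline CSR SpMV: y = A @ x, reading x directly for every nonzero."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[row] = acc
    return y


def spmv_cached_blocks(row_ptr, col_idx, vals, x, block_cols):
    """Column-partitioned SpMV with an explicitly cached slice of x.

    For each column block, the matching slice of x is copied once into a
    local buffer (a stand-in for GPU shared memory). Nonzeros inside the
    block then use a small block-local index instead of the global one.
    For brevity this rescans every nonzero per block; a practical version
    would group nonzeros by block ahead of time.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for start in range(0, len(x), block_cols):
        x_cache = x[start:start + block_cols]  # explicit copy of the slice
        for row in range(n_rows):
            acc = 0.0
            for k in range(row_ptr[row], row_ptr[row + 1]):
                local = col_idx[k] - start
                if 0 <= local < len(x_cache):
                    acc += vals[k] * x_cache[local]
            y[row] += acc
    return y


# Hypothetical 4x4 test matrix in CSR form:
# [[4, 0, 1, 0],
#  [0, 3, 0, 2],
#  [1, 0, 5, 0],
#  [0, 2, 0, 6]]
ROW_PTR = [0, 2, 4, 6, 8]
COL_IDX = [0, 2, 1, 3, 0, 2, 1, 3]
VALS = [4.0, 1.0, 3.0, 2.0, 1.0, 5.0, 2.0, 6.0]
X = [1.0, 2.0, 3.0, 4.0]

print(spmv_csr(ROW_PTR, COL_IDX, VALS, X))
print(spmv_cached_blocks(ROW_PTR, COL_IDX, VALS, X, block_cols=2))
```

Both calls produce the same result vector. The point of the second form is that block-local indices fit in fewer bits and the cached slice is reused by every row that touches the block, which are the two kinds of data-movement savings the summary above refers to.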
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Matrix Theory and Algorithms · Embedded Systems Design Techniques
