On Parallel Solution of Sparse Triangular Linear Systems in CUDA

Ruipeng Li

arXiv:1710.04985·cs.MS·October 16, 2017·6 cites

TL;DR

This paper presents new CUDA algorithms for efficiently solving sparse triangular linear systems, outperforming existing solvers by up to 2.6 times on structured and general sparse matrices.

Contribution

The paper introduces self-scheduling algorithms for parallel sparse triangular solves in CUDA, improving performance over existing level-scheduling methods.

Findings

01  The proposed CUDA algorithms outperform the cuSPARSE solvers by up to 2.6x

02  The proposed methods are effective for both structured and general sparse matrices

03  Self-scheduling techniques improve parallel efficiency in sparse triangular solves

Abstract

The acceleration of sparse matrix computations on modern many-core processors, such as graphics processing units (GPUs), has been recognized and studied for over a decade. Significant performance enhancements have been achieved for many sparse matrix computational kernels, such as sparse matrix-vector products and sparse matrix-matrix products. Solving linear systems with sparse triangular matrices is another important sparse kernel, demanded by a variety of scientific and engineering applications such as sparse linear solvers. However, the development of efficient parallel algorithms in CUDA for solving sparse triangular linear systems remains a challenging task due to the inherently sequential nature of the computation. In this paper, we revisit this problem by reviewing the existing level-scheduling methods and proposing algorithms with self-scheduling techniques.…
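To make the level-scheduling idea mentioned in the abstract concrete, here is a minimal CPU-side sketch in Python. It is not the paper's CUDA code, and it does not cover the proposed self-scheduling algorithms. It assumes a lower-triangular matrix in CSR form with every diagonal entry stored; the function names `level_sets` and `solve_lower` are hypothetical and chosen only for illustration. Rows are grouped so that each row depends only on rows in earlier levels, which is what lets rows within one level be solved concurrently.

```python
def level_sets(indptr, indices):
    """Group rows of a lower-triangular CSR matrix into dependency levels."""
    n = len(indptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j < i:  # off-diagonal entry: row i needs x[j] first
                level[i] = max(level[i], level[j] + 1)
    groups = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        groups[lv].append(i)
    return groups


def solve_lower(indptr, indices, data, b):
    """Solve L x = b one level at a time (assumes stored, nonzero diagonals)."""
    x = [0.0] * len(b)
    for rows in level_sets(indptr, indices):
        # Rows in the same level are mutually independent; on a GPU this
        # inner loop is the part that could run in parallel.
        for i in rows:
            s, diag = b[i], None
            for k in range(indptr[i], indptr[i + 1]):
                j = indices[k]
                if j == i:
                    diag = data[k]
                else:
                    s -= data[k] * x[j]
            x[i] = s / diag
    return x


# Example: L = [[2,0,0,0],[0,1,0,0],[1,0,4,0],[0,3,1,5]] in CSR form.
indptr = [0, 1, 2, 4, 7]
indices = [0, 1, 0, 2, 1, 2, 3]
data = [2.0, 1.0, 1.0, 4.0, 3.0, 1.0, 5.0]
groups = level_sets(indptr, indices)
x = solve_lower(indptr, indices, data, [2.0, 3.0, 9.0, 16.0])
```

In this example rows 0 and 1 have no dependencies and share the first level, while rows 2 and 3 each form their own level. The number of levels bounds the number of sequential steps, which is why matrices with long dependency chains expose little parallelism under this scheme.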

Taxonomy

Topics: Matrix Theory and Algorithms · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems