GSoFa: Scalable Sparse Symbolic LU Factorization on GPUs

Anil Gaihre; Xiaoye S. Li; Hang Liu

arXiv:2007.00840 · cs.DC · May 11, 2021 · 1 citation

TL;DR

This paper presents gSoFa, a GPU-based method for symbolic LU factorization of sparse matrices, reporting up to 31x speedup on Summit, a 5x average speedup over state-of-the-art CPU methods, and reduced memory use.

Contribution

gSoFa introduces the first GPU-optimized symbolic LU factorization algorithm with novel parallelization, workload balancing, and space reduction techniques for sparse matrices.

Findings

01. Up to 31x speedup on the Summit supercomputer.

02. Outperforms state-of-the-art CPU methods by 5x on average.

03. Achieves 47% of the peak memory throughput of a V100 GPU.

Abstract

Decomposing a matrix A into a lower triangular matrix L and an upper triangular matrix U, known as LU decomposition, is an essential operation in numerical linear algebra. For a sparse matrix, LU decomposition often introduces more nonzero entries in the L and U factors than in the original matrix. A symbolic factorization step is needed to identify the nonzero structures of the L and U matrices. Attracted by the enormous potential of Graphics Processing Units (GPUs), an array of efforts has deployed various LU factorization steps on GPUs, with the exception, to the best of our knowledge, of symbolic factorization. This paper introduces gSoFa, the first GPU-based symbolic factorization design, with the following three optimizations to enable scalable LU symbolic factorization for nonsymmetric pattern sparse matrices on GPUs. First, we introduce a novel fine-grained parallel symbolic…
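To make the role of symbolic factorization concrete, here is a minimal sequential sketch of how the nonzero structure of L and U can be predicted without doing any arithmetic. It is not gSoFa's GPU algorithm, whose details are cut off in the abstract above. It is a naive row-by-row simulation of Gaussian elimination on index sets, assuming no pivoting and a structurally nonzero diagonal. The function name `symbolic_lu` and the example matrix `arrow` are hypothetical names chosen for illustration.

```python
def symbolic_lu(n, nonzeros):
    """Predict the nonzero patterns of L and U for an n x n sparse matrix.

    Assumptions (not from the paper): no pivoting, and every diagonal
    entry is treated as structurally nonzero.
    `nonzeros` is an iterable of (row, col) index pairs.
    """
    # rows[i] holds the column indices of structurally nonzero entries in row i.
    rows = [set() for _ in range(n)]
    for i, j in nonzeros:
        rows[i].add(j)
    for i in range(n):
        rows[i].add(i)

    # Eliminating column k with pivot row k: every later row i that has a
    # nonzero in column k inherits the pattern of row k to the right of k.
    # Entries created this way are the fill-ins.
    for k in range(n):
        right_of_pivot = {j for j in rows[k] if j > k}
        for i in range(k + 1, n):
            if k in rows[i]:
                rows[i] |= right_of_pivot

    lower = {(i, j) for i in range(n) for j in rows[i] if j < i}   # strictly lower part
    upper = {(i, j) for i in range(n) for j in rows[i] if j >= i}  # diagonal and above
    return lower, upper


# A 4 x 4 "arrow" pattern: dense first row and first column plus the diagonal.
arrow = [(0, 0), (0, 1), (0, 2), (0, 3),
         (1, 0), (2, 0), (3, 0),
         (1, 1), (2, 2), (3, 3)]
lower, upper = symbolic_lu(4, arrow)
fill_in = len(lower) + len(upper) - len(set(arrow))
print(fill_in)  # 6: the arrow pattern fills in completely
```

The example shows why this step matters: a matrix with 10 nonzeros produces factors with 16, so storage for L and U has to be sized from the predicted structure rather than from the input.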

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Matrix Theory and Algorithms · Interconnection Networks and Systems