Basker: A Threaded Sparse LU Factorization Utilizing Hierarchical Parallelism and Data Layouts

Joshua Dennis Booth; Sivasankaran Rajamanickam; Heidi K. Thornquist

arXiv:1601.05725 · cs.DC · January 22, 2016

TL;DR

Basker is a scalable sparse direct solver based on LU factorization. It uses hierarchical parallelism and hierarchical data layouts to improve performance on modern architectures, with reported speedups of up to 7.4x over KLU on Xeon Phi and up to 53x over Intel MKL Pardiso on CPU.

Contribution

Basker introduces a new algorithm that parallelizes the Gilbert-Peierls method for sparse LU factorization and is structured to match the hierarchy of both the hardware parallelism and the memory layout.

Findings

01. Achieves up to 7.4x speedup on Xeon Phi compared to KLU.

02. Outperforms Intel MKL Pardiso by up to 53x on CPU.

03. Provides 5.4x speedup on challenging circuit matrices.

Abstract

Scalable sparse LU factorization is critical for efficient numerical simulation of circuits and electrical power grids. In this work, we present a new scalable sparse direct solver called Basker. Basker introduces a new algorithm to parallelize the Gilbert-Peierls algorithm for sparse LU factorization. As architectures evolve, there exists a need for algorithms that are hierarchical in nature to match the hierarchy in thread teams, individual threads, and vector level parallelism. Basker is designed to map well to this hierarchy in architectures. There is also a need for data layouts to match multiple levels of hierarchy in memory. Basker uses a two-dimensional hierarchical structure of sparse matrices that maps to the hierarchy in the memory architectures and to the hierarchy in parallelism. We present performance evaluations of Basker on the Intel SandyBridge and Xeon Phi platforms…
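
For readers unfamiliar with the Gilbert-Peierls algorithm named in the abstract, the following is a minimal sequential sketch in Python. It is not Basker's parallel algorithm and not taken from the paper. It assumes no pivoting is required (every diagonal pivot is nonzero), whereas practical implementations typically use partial pivoting, and it stores each column as a `{row: value}` dictionary. The names `reach` and `gilbert_peierls_lu` are hypothetical. The idea shown is left-looking, column-by-column factorization, where each column is obtained by a sparse triangular solve whose nonzero pattern is found by a depth-first search over the columns of L computed so far.

```python
def reach(L_cols, start_rows, j):
    """Rows reachable from start_rows in the graph of L (columns < j only),
    returned in topological order (reverse DFS postorder)."""
    visited, postorder = set(), []
    for s in start_rows:
        if s in visited:
            continue
        visited.add(s)
        stack = [(s, iter(L_cols[s]) if s < j else iter(()))]
        while stack:
            node, children = stack[-1]
            pushed = False
            for child in children:
                if child not in visited:
                    visited.add(child)
                    nxt = iter(L_cols[child]) if child < j else iter(())
                    stack.append((child, nxt))
                    pushed = True
                    break
            if not pushed:
                stack.pop()
                postorder.append(node)
    return postorder[::-1]


def gilbert_peierls_lu(A_cols):
    """Left-looking sparse LU without pivoting (assumption: nonzero pivots).

    A_cols[j] maps row index -> value for column j.
    Returns (L_cols, U_cols): L is unit lower triangular with the unit
    diagonal left implicit; U is upper triangular including its diagonal.
    """
    n = len(A_cols)
    L_cols = [dict() for _ in range(n)]
    U_cols = [dict() for _ in range(n)]
    for j in range(n):
        x = dict(A_cols[j])  # working copy of column j
        # Sparse triangular solve L x = A[:, j], visiting only reachable rows.
        for k in reach(L_cols, list(x), j):
            if k >= j:
                continue  # columns >= j of L are not computed yet
            xk = x.get(k, 0.0)
            for i, lik in L_cols[k].items():
                x[i] = x.get(i, 0.0) - lik * xk
        pivot = x.get(j, 0.0)
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        # Split the solved column into its U part and its scaled L part.
        for i, v in x.items():
            if i <= j:
                U_cols[j][i] = v
            else:
                L_cols[j][i] = v / pivot
    return L_cols, U_cols
```

Because the depth-first search touches only rows that can become nonzero, the work per column is proportional to the arithmetic actually performed rather than to the matrix dimension, which is what makes this approach attractive for the low fill-in matrices that arise in circuit simulation.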

Taxonomy

Topics: Matrix Theory and Algorithms · Parallel Computing and Optimization Techniques · Interconnection Networks and Systems