Parallel Sub-Structuring Methods for solving Sparse Linear Systems on a cluster of GPU
Abal-Kassim Cheik Ahamed, Frédéric Magoulès

TL;DR
This paper develops a hybrid multi-GPU and CPU sub-structuring algorithm for solving sparse linear systems from PDE discretizations, demonstrating up to 19x speed-up with CUDA acceleration on GPU clusters.
Contribution
It introduces a novel hybrid multi-GPU and CPU sub-structuring method for parallel sparse linear system solutions, leveraging CUDA for acceleration.
Findings
Achieved up to 19x speed-up in double precision with CUDA on a GPU cluster.
Compared a C+MPI implementation on a classical CPU cluster with a C+MPI+CUDA implementation on a GPU cluster.
Validated the approach on matrices from engineering problems.
Abstract
The main objective of this work is to analyze the sub-structuring method for the parallel solution of sparse linear systems whose matrices arise from the discretization of partial differential equations by finite element, finite volume, or finite difference methods. Building on the success of general-purpose processing on graphics processing units (GPGPU), we develop a hybrid multi-GPU and CPU sub-structuring algorithm. GPU computing, with CUDA, is used to accelerate the operations performed on each processor. Numerical experiments have been performed on a set of matrices arising from engineering problems. We compare a C+MPI implementation on a classical CPU cluster with a C+MPI+CUDA implementation on a cluster of GPUs. The performance comparison shows a speed-up of up to 19 times in double precision for the sub-structuring method when using CUDA.
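The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general sub-structuring idea. It assumes the standard primal Schur complement formulation, which may differ from the authors' variant. The unknowns are split into subdomain interiors and a shared interface. Each subdomain eliminates its interior unknowns independently, a small interface system is solved, and the interiors are recovered by back-substitution. The function names (`solve_dense`, `laplacian_1d`, `block`, `substructure_solve`) and the 1D Laplacian test matrix are hypothetical and chosen for illustration; the code is serial pure Python and uses no MPI or CUDA.

```python
def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def laplacian_1d(n):
    """Tridiagonal (-1, 2, -1) matrix, a stand-in for a PDE discretization."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


def block(A, rows, cols):
    """Extract the sub-block of A with the given row and column indices."""
    return [[A[i][j] for j in cols] for i in rows]


def substructure_solve(A, b, interiors, interface):
    """Solve A x = b by interface condensation (assumed Schur complement form).

    interiors: one list of interior indices per subdomain.
    interface: list of indices shared between subdomains.
    """
    m = len(interface)
    S = block(A, interface, interface)  # starts as A_GG, becomes the Schur complement
    g = [b[i] for i in interface]       # starts as b_G, becomes the condensed right-hand side
    local = []
    for I in interiors:                 # independent per subdomain (one per process in parallel)
        A_II = block(A, I, I)
        A_IG = block(A, I, interface)
        A_GI = block(A, interface, I)
        y = solve_dense(A_II, [b[i] for i in I])  # y = A_II^{-1} b_I
        # Z[c] holds column c of A_II^{-1} A_IG
        Z = [solve_dense(A_II, [row[c] for row in A_IG]) for c in range(m)]
        for r in range(m):
            g[r] -= sum(A_GI[r][k] * y[k] for k in range(len(I)))
            for c in range(m):
                S[r][c] -= sum(A_GI[r][k] * Z[c][k] for k in range(len(I)))
        local.append((I, y, Z))
    x_G = solve_dense(S, g)             # interface problem
    x = [0.0] * len(b)
    for pos, i in enumerate(interface):
        x[i] = x_G[pos]
    for I, y, Z in local:               # recover interiors: x_I = y - Z x_G
        for k, i in enumerate(I):
            x[i] = y[k] - sum(Z[c][k] * x_G[c] for c in range(m))
    return x


# Two subdomains of three interior nodes each, separated by one interface node.
A = laplacian_1d(7)
b = [1.0] * 7
x_sub = substructure_solve(A, b, [[0, 1, 2], [4, 5, 6]], [3])
x_ref = solve_dense(A, b)
```

In this sketch the per-subdomain loop carries the bulk of the work and has no dependencies between subdomains. That is the kind of per-processor operation the abstract describes accelerating with CUDA, with MPI presumably handling the exchange of interface contributions.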
