Comparing the Performance of Heterogeneous Conjugate Gradient and Cholesky Solvers on Various Hardware Using SYCL
Tim Thüring, Alexander Strack, Dirk Pflüger

TL;DR
This paper presents heterogeneous CPU-GPU implementations of the Conjugate Gradient and Cholesky solvers using SYCL, demonstrating significant performance improvements over traditional GPU-only methods on various hardware.
Contribution
It introduces multi-vendor, heterogeneous implementations of key linear algebra solvers leveraging SYCL, and compares their performance to homogeneous approaches.
Findings
Heterogeneous implementations are up to 32% faster for CG and 29% faster for Cholesky on large matrices.
The heterogeneous Cholesky implementation is at least 12% faster across different GPU vendors.
Performance gains are consistent across NVIDIA, AMD, and Intel systems.
Abstract
Many important real-world applications, such as System Identification with Gaussian Processes, involve solving linear systems with symmetric positive-definite matrices. The iterative CG method and direct solvers based on the Cholesky decomposition are two popular methods that can be applied in this case. Since often very large systems have to be solved when dealing with such real-world scenarios, GPUs are commonly used to accelerate the computations. However, homogeneous approaches that only leverage the GPU in the system do not take full advantage of the often powerful CPUs located in modern HPC systems. In this work, we present multi-vendor, heterogeneous implementations of the CG method and the Cholesky decomposition that leverage the CPU and GPU of a heterogeneous system simultaneously using SYCL. Furthermore, we compare their runtime behavior to traditional, homogeneous approaches.…
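To make the CG method mentioned in the abstract concrete, here is a minimal pure-Python sketch of Conjugate Gradient for a symmetric positive-definite system. It is not the paper's SYCL implementation. The row-partitioned matrix-vector product is only an illustration of how work could be divided between two devices; the names `split_matvec` and `gpu_share` are hypothetical, and both parts run sequentially on the CPU here.

```python
def matvec_rows(A, x, start, stop):
    """Multiply rows start..stop-1 of A with the vector x."""
    return [sum(a * b for a, b in zip(A[i], x)) for i in range(start, stop)]


def split_matvec(A, x, gpu_share=0.5):
    """Row-partitioned A @ x.

    The two row blocks stand in for the shares a GPU and a CPU might
    process in a heterogeneous setup (assumption for illustration only;
    here both blocks are computed one after the other on the CPU).
    """
    n = len(A)
    cut = int(gpu_share * n)
    top = matvec_rows(A, x, 0, cut)      # hypothetical "GPU" share
    bottom = matvec_rows(A, x, cut, n)   # hypothetical "CPU" share
    return top + bottom


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000, gpu_share=0.5):
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        Ap = split_matvec(A, p, gpu_share)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small symmetric positive-definite example system.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
```

The matrix-vector product is the dominant cost per CG iteration, which is why it is the part split into two shares in this sketch. How the paper actually distributes work between CPU and GPU is not described in the text above.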
