Efficient executions of Pipelined Conjugate Gradient Method on Heterogeneous Architectures

Manasi Tiwari; Sathish Vadhiyar

arXiv:2105.06176·cs.DC·May 14, 2021

TL;DR

This paper presents three methods for efficiently executing the Pipelined Preconditioned Conjugate Gradient (PCG) algorithm on heterogeneous CPU-GPU architectures, reporting speedups of up to 8x over CPU implementations and up to 5x over GPU implementations.

Contribution

The paper introduces task-parallel and data-parallel strategies for executing Pipelined PCG on heterogeneous systems, including a workload decomposition between CPU and GPU that is driven by a performance model.
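The idea of a performance-model-driven split can be illustrated with a minimal sketch. It assumes a simple 1D row split proportional to measured device speed, whereas the paper describes a 2D decomposition whose details are not given on this page. The function name `split_rows` and the timing inputs are hypothetical and are not taken from the paper.

```python
def split_rows(n_rows, cpu_time, gpu_time):
    """Split n_rows between CPU and GPU in proportion to measured speed.

    cpu_time and gpu_time are hypothetical timings (in seconds) of the same
    trial workload on each device; speed is modelled as 1 / time.
    This is an illustrative 1D split, not the paper's 2D decomposition.
    """
    if n_rows < 0 or cpu_time <= 0 or gpu_time <= 0:
        raise ValueError("n_rows must be >= 0 and timings must be positive")
    cpu_speed = 1.0 / cpu_time
    gpu_speed = 1.0 / gpu_time
    # The faster device receives the proportionally larger share of rows.
    gpu_rows = round(n_rows * gpu_speed / (cpu_speed + gpu_speed))
    cpu_rows = n_rows - gpu_rows
    return cpu_rows, gpu_rows


if __name__ == "__main__":
    # A GPU that ran the trial 4x faster than the CPU gets 80% of the rows.
    print(split_rows(1000, cpu_time=4.0, gpu_time=1.0))
```

With this kind of split, both devices would be expected to finish their share at roughly the same time, provided the trial timings are representative of the full workload.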

Findings

01. Up to 8x speedup over CPU implementations

02. Up to 5x speedup over GPU implementations

03. Effective handling of large problems exceeding GPU memory

Abstract

The Preconditioned Conjugate Gradient (PCG) method is widely used for solving linear systems of equations with sparse matrices. A recent version of PCG, Pipelined PCG, eliminates the dependencies in the computations of the PCG algorithm so that the non-dependent computations can be overlapped with communication. In this paper, we propose three methods for efficient execution of the Pipelined PCG algorithm on GPU-accelerated heterogeneous architectures. The first two methods achieve task parallelism using asynchronous executions of different tasks on CPU cores and GPU. The third method achieves data parallelism by decomposing the workload between CPU and GPU based on a performance model. The performance model takes into account the relative performance of CPU cores and GPU, measured using some initial executions, and performs 2D data decomposition. We also implement optimization strategies like…
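To make the dependency removal concrete, the following is a minimal serial sketch of the standard pipelined PCG recurrences. It is not the authors' implementation: it assumes a small dense symmetric positive definite matrix stored as nested lists, a Jacobi (diagonal) preconditioner, and a zero initial guess, and the name `pipelined_pcg` and its helpers are hypothetical. The point to notice is that the two dot products in each iteration do not depend on the preconditioner application and matrix-vector product that follow them, so a parallel implementation can overlap the two.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def axpy(alpha, x, y):
    """Return a new list y + alpha * x."""
    return [yi + alpha * xi for xi, yi in zip(x, y)]


def matvec(A, v):
    return [dot(row, v) for row in A]


def pipelined_pcg(A, b, tol=1e-10, max_iter=200):
    """Solve A x = b for SPD A with pipelined PCG and a Jacobi preconditioner."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]

    def precond(v):
        return [d * vi for d, vi in zip(inv_diag, v)]

    x = [0.0] * n
    r = list(b)            # r = b - A x with x = 0
    u = precond(r)         # u = M^{-1} r
    w = matvec(A, u)       # w = A u
    z = q = s = p = [0.0] * n
    gamma_old = alpha_old = 1.0

    for i in range(max_iter):
        # Extra residual check kept for clarity; it adds one dot product.
        if dot(r, r) ** 0.5 < tol:
            break
        # These two reductions are independent of the two lines after them,
        # so in a parallel run they can overlap with that computation.
        gamma = dot(r, u)
        delta = dot(w, u)
        m = precond(w)     # m = M^{-1} w
        nv = matvec(A, m)  # nv = A m
        if i > 0:
            beta = gamma / gamma_old
            alpha = gamma / (delta - beta * gamma / alpha_old)
        else:
            beta = 0.0
            alpha = gamma / delta
        # Recurrences that keep s = A p, q = M^{-1} s, z = A q.
        z = axpy(beta, z, nv)
        q = axpy(beta, q, m)
        s = axpy(beta, s, w)
        p = axpy(beta, p, u)
        x = axpy(alpha, p, x)
        r = axpy(-alpha, s, r)
        u = axpy(-alpha, q, u)
        w = axpy(-alpha, z, w)
        gamma_old, alpha_old = gamma, alpha
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    b = [6.0, 10.0, 8.0]   # equals A times [1, 2, 3]
    print(pipelined_pcg(A, b))
```

The price of the overlap is the additional vector recurrences for `z`, `q`, `s`, and `w`, which replace the explicit matrix-vector and preconditioner dependencies of classical PCG.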

Taxonomy

Topics: Matrix Theory and Algorithms · Tensor decomposition and applications · Electromagnetic Scattering and Analysis