Communication-avoiding Cholesky-QR2 for rectangular matrices

Edward Hutter; Edgar Solomonik

arXiv:1710.08471 · cs.DC · June 18, 2019

TL;DR

This paper presents a communication-avoiding parallel CholeskyQR2 algorithm for rectangular matrices. It reduces interprocessor communication by a factor of $\Theta(P^{1/6})$ on $P$ processors relative to previous parallel QR implementations, which improves the scalability of QR factorization on supercomputers.

Contribution

It introduces a more general parallelization of CholeskyQR2 over a 3D processor grid whose dimensions can be tuned to trade off synchronization, interprocessor communication, computational work, and memory footprint.

Findings

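01 Achieves a factor of $\Theta(P^{1/6})$ less interprocessor communication on $P$ processors than previous parallel QR implementations.

02 Runs up to 3.3x faster than ScaLAPACK's QR on large-scale systems.

03 Demonstrates scalability to a large number of nodes on Intel Knights-Landing and Cray XE supercomputers.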

Abstract

Scalable QR factorization algorithms for solving least squares and eigenvalue problems are critical given the increasing parallelism within modern machines. We introduce a more general parallelization of the CholeskyQR2 algorithm and show its effectiveness for a wide range of matrix sizes. Our algorithm executes over a 3D processor grid, the dimensions of which can be tuned to trade off costs in synchronization, interprocessor communication, computational work, and memory footprint. We implement this algorithm, yielding a code that can achieve a factor of $\Theta(P^{1/6})$ less interprocessor communication on $P$ processors than any previous parallel QR implementation. Our performance study on Intel Knights-Landing and Cray XE supercomputers demonstrates the effectiveness of this CholeskyQR2 parallelization on a large number of nodes. Specifically, relative to ScaLAPACK's QR, on 1024…
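For readers unfamiliar with the underlying numerical kernel, the following is a minimal sequential sketch of CholeskyQR2 in dependency-free Python. It is not the paper's 3D-grid parallel implementation. It only illustrates the standard CholeskyQR2 recipe: form the Gram matrix $A^T A$, take its Cholesky factor $R$, compute $Q = A R^{-1}$, then repeat once on $Q$ to restore orthogonality. All function names (`gram`, `cholesky_upper`, `solve_right_upper`, `matmul`, `cholesky_qr`, `cholesky_qr2`) are hypothetical and chosen for illustration, and matrices are assumed to be stored as lists of rows with full column rank.

```python
import math


def gram(A):
    """Return the n x n Gram matrix A^T A of an m x n matrix (list of rows)."""
    n = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n)]
            for i in range(n)]


def cholesky_upper(G):
    """Return upper-triangular R with R^T R = G, for symmetric positive definite G."""
    n = len(G)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        d = G[i][i] - sum(R[k][i] ** 2 for k in range(i))
        if d <= 0.0:
            # Happens when A is rank deficient or too ill-conditioned.
            raise ValueError("Gram matrix is not numerically positive definite")
        R[i][i] = math.sqrt(d)
        for j in range(i + 1, n):
            s = G[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = s / R[i][i]
    return R


def solve_right_upper(A, R):
    """Return Q = A R^{-1} by solving q R = a for every row a of A."""
    n = len(R)
    Q = []
    for a in A:
        q = [0.0] * n
        for j in range(n):
            q[j] = (a[j] - sum(q[k] * R[k][j] for k in range(j))) / R[j][j]
        Q.append(q)
    return Q


def matmul(X, Y):
    """Plain dense matrix product X Y."""
    return [[sum(x[k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for x in X]


def cholesky_qr(A):
    """One CholeskyQR pass: A = Q R with R from the Cholesky factor of A^T A."""
    R = cholesky_upper(gram(A))
    return solve_right_upper(A, R), R


def cholesky_qr2(A):
    """CholeskyQR2: run CholeskyQR twice and combine the triangular factors."""
    Q1, R1 = cholesky_qr(A)
    Q, R2 = cholesky_qr(Q1)      # second pass repairs loss of orthogonality
    return Q, matmul(R2, R1)     # A = Q (R2 R1)
```

A short usage example: calling `cholesky_qr2([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 9.0]])` returns a 4 x 2 factor `Q` with nearly orthonormal columns and a 2 x 2 upper-triangular `R` such that `matmul(Q, R)` reproduces the input up to rounding. In a parallel setting, the Gram matrix product and the triangular solve are the steps whose data distribution determines communication cost, which is where a tunable processor grid such as the one described in the abstract comes into play.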
