QR factorization of ill-conditioned tall-and-skinny matrices on distributed-memory systems
Nenad Mijić, Abhiram Kaushik, Davor Davidović

TL;DR
This paper introduces a distributed-memory algorithm for QR factorization of extremely ill-conditioned tall-and-skinny matrices, combining communication-avoiding techniques with improved numerical stability and high performance on CPU and GPU systems.
Contribution
The paper presents a novel distributed implementation of a stable QR factorization algorithm for ill-conditioned matrices, interleaving CholeskyQR2 with Gram-Schmidt to enhance stability and performance.
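To make the combination of CholeskyQR2 and Gram-Schmidt concrete, here is a minimal single-node NumPy sketch of a generic panelled scheme: each column panel is first orthogonalized against the already computed columns by block Gram-Schmidt, then factorized with CholeskyQR2. It is not the paper's modified algorithm. The function names, the `panel_width` parameter, and the choice to apply the projection twice are assumptions made for illustration.

```python
import numpy as np

def cholesky_qr(A):
    """One CholeskyQR pass: A = Q R with R from the Cholesky factor of A^T A."""
    G = A.T @ A                       # n x n Gram matrix
    L = np.linalg.cholesky(G)         # G = L L^T, L lower triangular
    Q = np.linalg.solve(L, A.T).T     # Q = A L^{-T}
    return Q, L.T

def cholesky_qr2(A):
    """CholeskyQR2: a second pass repairs the orthogonality lost in the first."""
    Q1, R1 = cholesky_qr(A)
    Q, R2 = cholesky_qr(Q1)
    return Q, R2 @ R1

def panel_cholesky_qr2_bgs(A, panel_width):
    """Hypothetical sketch: block Gram-Schmidt across panels, CholeskyQR2 within."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.empty((m, n))
    R = np.zeros((n, n))
    for j in range(0, n, panel_width):
        k = min(j + panel_width, n)
        P = A[:, j:k].copy()
        if j > 0:
            # Project out the span of the previous panels (applied twice,
            # an assumption here, to limit loss of orthogonality).
            for _ in range(2):
                C = Q[:, :j].T @ P
                P -= Q[:, :j] @ C
                R[:j, j:k] += C
        Q[:, j:k], R[j:k, j:k] = cholesky_qr2(P)
    return Q, R
```

Calling `panel_cholesky_qr2_bgs(A, panel_width=4)` on a tall-and-skinny floating-point array `A` returns `Q` with orthonormal columns and an upper-triangular `R` such that `Q @ R` reproduces `A` up to rounding. Narrow panels keep each Gram matrix better conditioned, at the cost of more projection steps.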
Findings
Achieves numerical stability comparable to state-of-the-art methods.
Outperforms existing algorithms by up to 80x on GPU systems.
Demonstrates significant performance improvements in weak scaling tests.
Abstract
In this paper we present a novel algorithm developed for computing the QR factorisation of extremely ill-conditioned tall-and-skinny matrices on distributed-memory systems. The algorithm is based on the communication-avoiding CholeskyQR2 algorithm and its block Gram-Schmidt variant. The latter improves the numerical stability of the CholeskyQR2 algorithm and significantly reduces the loss of orthogonality even for matrices with condition numbers up to . Currently, there is no distributed GPU version of this algorithm available in the literature, which prevents the application of this method to very large matrices. In our work we provide a distributed implementation of this algorithm and also introduce a modified version that improves the performance, especially in the case of extremely ill-conditioned matrices. The main innovation of our approach lies in the interleaving of the…
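The communication-avoiding character of CholeskyQR2 on a row-distributed matrix can be illustrated with a small simulation. In the sketch below, a Python list of row blocks stands in for the processes, and a plain sum of the local Gram matrices stands in for the single all-reduce that a real implementation would perform per pass (for example with MPI). The function names and the list-based "ranks" are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def distributed_cholesky_qr(local_blocks):
    """One CholeskyQR pass over row blocks; each block plays the role of a rank."""
    # Local work: every rank forms its own small n x n Gram matrix.
    local_grams = [B.T @ B for B in local_blocks]
    # Only communication step: summing the Gram matrices (stand-in for an all-reduce).
    G = sum(local_grams)
    # Redundant small work on every rank: Cholesky factor of the global Gram matrix.
    L = np.linalg.cholesky(G)
    # Local work again: each rank scales its own rows, Q_i = A_i L^{-T}.
    local_Q = [np.linalg.solve(L, B.T).T for B in local_blocks]
    return local_Q, L.T

def distributed_cholesky_qr2(local_blocks):
    """Two passes, so two reductions of an n x n matrix in total."""
    Q1, R1 = distributed_cholesky_qr(local_blocks)
    Q2, R2 = distributed_cholesky_qr(Q1)
    return Q2, R2 @ R1
```

For example, `distributed_cholesky_qr2(np.array_split(A, 4, axis=0))` treats `A` as spread over four ranks and returns the four local pieces of `Q` together with the shared factor `R`. The reduced data is only n x n per pass regardless of the number of rows, which is why the approach suits tall-and-skinny matrices.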
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Neural Networks and Applications · DNA and Biological Computing
