Efficient Realization of Householder Transform through Algorithm-Architecture Co-design for Acceleration of QR Factorization
Farhad Merchant, Tarun Vatwani, Anupam Chattopadhyay, Soumyendu Raha, S K Nandy, and Ranjani Narayan

TL;DR
This paper introduces a co-designed algorithm-architecture approach to optimize Householder Transform-based QR factorization, achieving significant performance improvements through parallelism enhancements and custom hardware implementation.
Contribution
It proposes a modified Householder Transform (MHT) with higher parallelism, demonstrating improved performance on specialized hardware compared to classical methods.
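For reference, the classical method that MHT is compared against can be sketched as follows. This is a minimal textbook Householder QR written with NumPy, not the paper's MHT and not its hardware realization; the function name `householder_qr` and the sign convention for the reflector are assumptions chosen for illustration.

```python
import numpy as np

def householder_qr(A):
    """Classical Householder QR (textbook form). Returns Q (m x m), R (m x n), A = Q @ R."""
    R = np.array(A, dtype=float)   # working copy that is reduced to upper-triangular form
    m, n = R.shape
    Q = np.eye(m)
    for k in range(min(m - 1, n)):
        x = R[k:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue               # nothing to eliminate in this column
        v = x.copy()
        # Add sign(x[0]) * ||x|| to the first entry to avoid cancellation.
        v[0] += np.copysign(norm_x, x[0])
        v /= np.linalg.norm(v)
        # Apply the reflector P = I - 2 v v^T to the trailing submatrix of R.
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        # Accumulate Q = P_1 P_2 ... P_k by applying P from the right.
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, R

A = np.array([[4.0, 1.0, 2.0],
              [2.0, 3.0, 1.0],
              [1.0, 2.0, 5.0]])
Q, R = householder_qr(A)
```

Each iteration first forms the reflector vector and only then updates the trailing matrix. The abstract describes MHT as a rearrangement of the computations in classical HT; the exact rearrangement is not given in this summary, so it is not reproduced here.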
Findings
MHT exhibits 1.33x higher parallelism than classical HT.
MHT achieves 1.3x better performance than classical HT on the same platform.
Custom MHT implementation outperforms optimized software packages by 12%.
Abstract
We present an efficient realization of Householder Transform (HT) based QR factorization through algorithm-architecture co-design, achieving a performance improvement of 3-90x in terms of Gflops/watt over state-of-the-art multicore, General Purpose Graphics Processing Units (GPGPUs), Field Programmable Gate Arrays (FPGAs), and ClearSpeed CSX700. Theoretical and experimental analysis of classical HT is performed to identify opportunities for a higher degree of parallelism, where parallelism is quantified as the number of parallel operations per level in the Directed Acyclic Graph (DAG) of the transform. Based on the theoretical analysis of classical HT, an opportunity to rearrange computations is identified, resulting in Modified HT (MHT), which is shown to exhibit 1.33x higher parallelism than classical HT. Experiments in off-the-shelf multicore and General…
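The parallelism metric described in the abstract (operations per level of the DAG) can be illustrated with a small sketch. The helper `ops_per_level` and the toy dependency graph below are hypothetical and are not taken from the paper; the graph models a length-4 dot product, a building block of Householder updates, rather than the full transform.

```python
from collections import defaultdict

def ops_per_level(deps):
    """Count operations at each DAG level.

    deps maps every operation name to the list of operations it depends on.
    An operation with no dependencies sits at level 1; otherwise its level is
    one more than the deepest of its dependencies.
    """
    level = {}

    def depth(op):
        if op not in level:
            level[op] = 1 + max((depth(d) for d in deps[op]), default=0)
        return level[op]

    counts = defaultdict(int)
    for op in deps:
        counts[depth(op)] += 1
    return [counts[lvl] for lvl in sorted(counts)]

# Toy DAG (assumption): 4 independent multiplies, then a tree of 3 additions.
dot4 = {
    "m0": [], "m1": [], "m2": [], "m3": [],
    "a0": ["m0", "m1"], "a1": ["m2", "m3"],
    "a2": ["a0", "a1"],
}
levels = ops_per_level(dot4)               # operations available at each level
avg_parallelism = sum(levels) / len(levels)  # total operations divided by DAG depth
```

Under this definition, a rearrangement that keeps the operation count roughly constant while shortening the DAG raises the average number of operations per level. The summary does not state how the 1.33x figure for MHT is derived, so this sketch only illustrates the metric itself.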
