Efficient Realization of Givens Rotation through Algorithm-Architecture Co-design for Acceleration of QR Factorization
Farhad Merchant, Tarun Vatwani, Anupam Chattopadhyay, Soumyendu Raha, S K Nandy, Ranjani Narayan, and Rainer Leupers

TL;DR
This paper introduces an efficient Givens Rotation-based QR factorization method using algorithm-architecture co-design, achieving significant Gflops/watt improvements over state-of-the-art realizations on multicore and GPGPU platforms.
Contribution
It presents a novel GGR implementation with macro operations on a reconfigurable data-path, leading to substantial speed-ups and energy efficiency over existing methods.
Findings
Achieves 3-100x better Gflops/watt than state-of-the-art realizations on multicore and GPGPU platforms.
GGR reduces multiplications by 33% compared to classical GR.
Outperforms GEMM by 10% in Gflops/watt.
Abstract
We present an efficient realization of Generalized Givens Rotation (GGR) based QR factorization that achieves 3-100x better performance in terms of Gflops/watt over state-of-the-art realizations on multicore and General Purpose Graphics Processing Units (GPGPUs). GGR is an improvement over the classical Givens Rotation (GR) operation that can annihilate multiple elements of rows and columns of an input matrix simultaneously. GGR takes 33% fewer multiplications than GR. For a custom implementation of GGR, we identify macro operations in GGR and realize them on a Reconfigurable Data-path (RDP) tightly coupled to the pipeline of a Processing Element (PE). In the PE, GGR attains a speed-up of 1.1x over the Modified Householder Transform (MHT) presented in the literature. For parallel realization of GGR, we use REDEFINE, a scalable massively parallel Coarse-grained Reconfigurable Architecture, and show…
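For orientation, the classical GR baseline that the abstract compares against can be sketched as below. This is a minimal illustration of textbook Givens-rotation QR, which annihilates one sub-diagonal element per rotation; it is not the paper's GGR, which annihilates several elements at once, and it does not model the RDP or REDEFINE hardware. The function names `givens` and `givens_qr` are hypothetical and chosen only for this example.

```python
import math


def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0  # nothing to annihilate
    r = math.hypot(a, b)
    return a / r, b / r


def givens_qr(A):
    """Classical Givens QR on a list-of-lists matrix A (m x n).

    Returns (Q, R) with A = Q R, Q orthogonal (m x m), R upper triangular (m x n).
    One element is zeroed per rotation, working up each column from the bottom.
    """
    m, n = len(A), len(A[0])
    R = [[float(x) for x in row] for row in A]
    Q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        for i in range(m - 1, j, -1):
            c, s = givens(R[i - 1][j], R[i][j])
            # Apply the rotation to rows i-1 and i of R (R <- G R).
            for k in range(n):
                u, v = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * u + s * v
                R[i][k] = -s * u + c * v
            # Accumulate the transpose into Q (Q <- Q G^T) so that Q R stays equal to A.
            for k in range(m):
                u, v = Q[k][i - 1], Q[k][i]
                Q[k][i - 1] = c * u + s * v
                Q[k][i] = -s * u + c * v
    return Q, R
```

Calling `givens_qr([[6, 5, 0], [5, 1, 4], [0, 4, 3]])` returns a `Q` and `R` whose product reproduces the input up to rounding. Each inner rotation touches only two rows, which is the per-element cost that a generalized rotation aims to amortize.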
Taxonomy
Topics: Embedded Systems Design Techniques · Interconnection Networks and Systems · Video Coding and Compression Technologies
