High-Performance Partial Spectrum Computation for Symmetric Eigenvalue Problems and the SVD
D. Keyes, H. Ltaief, Y. Nakatsukasa, and D. Sukkari

TL;DR
This paper introduces a new QDWH-based method for computing partial spectra of symmetric eigenvalue problems and the SVD, achieving speedups of up to 6X (SVD) and 3.5X (EIG) over ScaLAPACK and better hardware utilization on large distributed systems.
Contribution
The paper presents a novel QDWH-based partial spectrum solver optimized for distributed systems, improving performance and hardware utilization over existing methods.
Findings
Speedup of up to 6X over ScaLAPACK for SVD
Speedup of up to 3.5X over ScaLAPACK for EIG
Better hardware occupancy and peak performance extraction
Abstract
Current dense symmetric eigenvalue (EIG) and singular value decomposition (SVD) implementations may suffer from the lack of concurrency during the tridiagonal and bidiagonal reductions, respectively. This performance bottleneck is typical for the two-sided transformations due to the Level-2 BLAS memory-bound calls. Therefore, the current state-of-the-art EIG and SVD implementations may achieve only a small fraction of the system's sustained peak performance. The QR-based Dynamically Weighted Halley (QDWH) algorithm may be used as a pre-processing step toward the EIG and SVD solvers, while mitigating the aforementioned bottleneck. QDWH-EIG and QDWH-SVD expose more parallelism, while relying on compute-bound matrix operations. Both run closer to the sustained peak performance of the system, but at the expense of performing more FLOPS than the standard EIG and SVD algorithms. In this…
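To make the "QDWH as a pre-processing step" idea concrete, here is a minimal single-node NumPy sketch. It is not the authors' distributed implementation, and the function names `qdwh_polar` and `qdwh_svd` are hypothetical. It assumes a real, square, nonsingular matrix, uses the QR-based form of the dynamically weighted Halley iteration to compute the polar decomposition A = U_p H, and then obtains the SVD from a symmetric eigendecomposition of H. The bounds `alpha` and `l` are computed with deliberately simple (and expensive) dense formulas chosen for clarity, not for performance.

```python
import numpy as np

def qdwh_polar(A, tol=1e-12, max_iter=30):
    """Polar decomposition A = U @ H of a real, square, nonsingular A via QDWH.

    Illustrative sketch only: the bounds below are simple dense estimates.
    """
    n = A.shape[0]
    alpha = np.linalg.norm(A, "fro")        # upper bound on ||A||_2
    X = A / alpha
    # Lower bound on sigma_min(X), since ||inv(X)||_2 <= sqrt(n) * ||inv(X)||_1.
    l = 1.0 / (np.sqrt(n) * np.linalg.norm(np.linalg.inv(X), 1))
    I = np.eye(n)
    for _ in range(max_iter):
        # Dynamically chosen weights (a, b, c) from the current lower bound l.
        l2 = l * l
        d = np.cbrt(4.0 * (1.0 - l2) / (l2 * l2))
        s = np.sqrt(1.0 + d)
        a = s + 0.5 * np.sqrt(8.0 - 4.0 * d + 8.0 * (2.0 - l2) / (l2 * s))
        b = (a - 1.0) ** 2 / 4.0
        c = a + b - 1.0
        # QR-based update: [sqrt(c) X; I] = [Q1; Q2] R.
        Q, _ = np.linalg.qr(np.vstack([np.sqrt(c) * X, I]))
        Q1, Q2 = Q[:n], Q[n:]
        X_new = (b / c) * X + (a - b / c) / np.sqrt(c) * (Q1 @ Q2.T)
        # Update the lower bound; clamp to 1 to guard against rounding.
        l = min(1.0, l * (a + b * l2) / (1.0 + c * l2))
        done = np.linalg.norm(X_new - X, "fro") <= tol * np.linalg.norm(X_new, "fro")
        X = X_new
        if done:
            break
    U = X
    H = U.T @ A
    H = 0.5 * (H + H.T)                     # enforce symmetry
    return U, H

def qdwh_svd(A):
    """SVD of A from its polar decomposition: A = (U_p V) diag(s) V^T."""
    Up, H = qdwh_polar(A)
    w, V = np.linalg.eigh(H)                # eigenvalues in ascending order
    order = np.argsort(w)[::-1]             # singular values in descending order
    w, V = w[order], V[:, order]
    return Up @ V, w, V.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 6))
    U, s, Vt = qdwh_svd(A)
    print(np.linalg.norm(U @ np.diag(s) @ Vt - A))
```

The iteration consists only of QR factorizations and matrix-matrix products, which is the compute-bound character the abstract refers to. The price is extra floating-point work compared with a reduction-based solver.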
Taxonomy
Topics: Matrix Theory and Algorithms · Parallel Computing and Optimization Techniques · Tensor decomposition and applications
