Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices
Justus A. Calvin, Cannada A. Lewis, and Edward F. Valeev

TL;DR
This paper introduces a scalable, task-based matrix multiplication algorithm for the block-rank-sparse matrices that arise in quantum chemistry. For full-rank (dense) matrices, its performance usually exceeds that of state-of-the-art dense implementations.
Contribution
It presents a novel task-based formulation of SUMMA that schedules multiple SUMMA iterations concurrently and composes the computation from fine-grained tasks, making it tolerant of load imbalance and eliminating artifactual sources of global synchronization.
Findings
Demonstrates scalability of the iterative computation of the square-root inverse of block-rank-sparse QC matrices.
For full-rank (dense) matrices, usually outperforms the state-of-the-art dense matrix multiplication implementations ScaLAPACK and Cyclops Tensor Framework.
Achieves efficient parallel performance for irregular matrix structures.
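The square-root inverse mentioned above can be computed using matrix multiplications alone, which is why it is a natural benchmark for an MM algorithm. The abstract does not say which iteration the authors use. The sketch below assumes the coupled Newton-Schulz iteration, a standard multiplication-only scheme, and applies it to a small dense symmetric positive-definite matrix stored as a list of rows. The names `matmul` and `inv_sqrt_newton_schulz` are hypothetical helpers, not taken from the paper.

```python
def matmul(x, y):
    """Dense product of two small matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def inv_sqrt_newton_schulz(a, iters=20):
    """Approximate A^(-1/2) for a symmetric positive-definite matrix A.

    Assumption: coupled Newton-Schulz iteration (not confirmed by the source).
    Each step costs three matrix multiplications and nothing else.
    """
    n = len(a)
    # The Frobenius norm bounds the largest eigenvalue from above, so the
    # scaled matrix A / s has eigenvalues in (0, 1] and the iteration converges.
    s = sum(v * v for row in a for v in row) ** 0.5
    y = [[v / s for v in row] for row in a]  # Y_0 = A / s
    z = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # Z_0 = I
    for _ in range(iters):
        zy = matmul(z, y)
        # T = (3 I - Z Y) / 2
        t = [[((3.0 if i == j else 0.0) - zy[i][j]) / 2.0 for j in range(n)]
             for i in range(n)]
        y = matmul(y, t)  # Y converges to (A / s)^(1/2)
        z = matmul(t, z)  # Z converges to (A / s)^(-1/2)
    # Undo the scaling: A^(-1/2) = (A / s)^(-1/2) / sqrt(s)
    root_s = s ** 0.5
    return [[v / root_s for v in row] for row in z]
```

Calling `inv_sqrt_newton_schulz([[4.0, 0.0], [0.0, 9.0]])` should return a matrix close to diag(1/2, 1/3). In a distributed setting, each `matmul` call would be replaced by the parallel block-sparse product, so the cost of the whole iteration is dominated by matrix multiplication.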
Abstract
A task-based formulation of the Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC). The novel features of our formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and (2) fine-grained task-based composition. These features make it tolerant of the load imbalance due to the irregular matrix structure and eliminate all artifactual sources of global synchronization. Scalability of iterative computation of square-root inverse of block-rank-sparse QC matrices is demonstrated; for full-rank (dense) matrices the performance of our SUMMA formulation usually exceeds that of the state-of-the-art dense MM implementations (ScaLAPACK and Cyclops Tensor Framework).
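To make the structure of SUMMA concrete, here is a minimal serial sketch in Python. It shows only the outer-product pattern: in iteration k, the k-th block column of A and the k-th block row of B are paired, and each nonzero pair contributes to one block of C. The sketch makes three assumptions that go beyond the abstract. Matrices are stored as dictionaries mapping `(block_row, block_col)` to small dense blocks, absent keys stand for zero blocks, and low-rank block representations are left out. The function names are hypothetical. The loop over k runs sequentially here, whereas the paper's formulation schedules multiple iterations concurrently as fine-grained tasks.

```python
def matmul(x, y):
    """Dense product of two small blocks stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def add_into(acc, x):
    """Accumulate block x into block acc in place."""
    for acc_row, x_row in zip(acc, x):
        for j, v in enumerate(x_row):
            acc_row[j] += v


def summa_block_sparse(a, b, n_inner):
    """Return C = A B for block-sparse A and B given as {(row, col): block}.

    n_inner is the number of block columns of A (= block rows of B).
    Zero blocks are simply missing from the dictionaries and are skipped.
    """
    c = {}
    for k in range(n_inner):  # one SUMMA iteration per inner block index
        # In a distributed run these panels would be broadcast along
        # process rows and columns; here they are plain dictionary views.
        a_panel = {i: blk for (i, kk), blk in a.items() if kk == k}
        b_panel = {j: blk for (kk, j), blk in b.items() if kk == k}
        for i, a_blk in a_panel.items():
            for j, b_blk in b_panel.items():
                prod = matmul(a_blk, b_blk)  # one fine-grained unit of work
                if (i, j) in c:
                    add_into(c[(i, j)], prod)
                else:
                    c[(i, j)] = prod
    return c
```

Because only nonzero block pairs generate work, the amount of computation per iteration and per output block varies with the sparsity pattern. This is the source of the load imbalance that the task-based formulation is designed to tolerate.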
