Efficiently Parallelizable Strassen-Based Multiplication of a Matrix by its Transpose
Viviana Arrigoni, Filippo Maggioli, Annalisa Massini, Emanuele Rodolà

TL;DR
This paper introduces a new cache-oblivious, parallelizable algorithm based on Strassen's method for efficiently computing the product of a matrix and its transpose, applicable to both shared and distributed memory systems.
Contribution
It presents a novel ATA algorithm that reduces computational cost, exploits symmetry for memory savings, and demonstrates scalability and performance improvements over existing solutions.
Findings
Reduces the computational cost to (14/3) n^{log_2 7} floating point operations.
Scales well with both matrix size and the number of processes.
Performs favorably against existing solutions in both sequential and parallel implementations.
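The constant 14/3 can be recovered from a short recurrence. The following is only a sketch, resting on assumptions that this page does not state: that ATA splits A into 2×2 blocks, that the two diagonal blocks of the product are obtained from four recursive ATA calls on half-size blocks, that the single off-diagonal block needs two half-size Strassen products, that a Strassen product of order n costs about 7 n^{log_2 7} operations to leading order, and that the O(n^2) block additions can be neglected. T(n) and S(n) are names introduced here for the cost of ATA and of Strassen respectively.

```latex
\begin{align*}
S(n) &\approx 7\,n^{\log_2 7}
  \quad\Longrightarrow\quad
  S(n/2) \approx 7\cdot\frac{n^{\log_2 7}}{7} = n^{\log_2 7},\\
T(n) &= 4\,T(n/2) + 2\,S(n/2) \approx 4\,T(n/2) + 2\,n^{\log_2 7}.\\
\intertext{Trying $T(n) = c\,n^{\log_2 7}$, so that $T(n/2) = \tfrac{c}{7}\,n^{\log_2 7}$:}
c &= \tfrac{4}{7}\,c + 2
  \quad\Longrightarrow\quad
  c = \tfrac{14}{3}, \qquad
  \frac{T(n)}{S(n)} \approx \frac{14/3}{7} = \frac{2}{3}.
\end{align*}
```

Under these assumptions the product of a matrix by its transpose costs roughly two thirds of a general Strassen product of the same order.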
Abstract
The multiplication of a matrix by its transpose, A^T A, appears as an intermediate operation in the solution of a wide set of problems. In this paper, we propose a new cache-oblivious algorithm (ATA) for computing this product, based upon the classical Strassen algorithm as a sub-routine. In particular, we decrease the computational cost to 2/3 of the time required by Strassen's algorithm, amounting to (14/3) n^{log_2 7} floating point operations. ATA works for generic rectangular matrices, and exploits the peculiar symmetry of the resulting product matrix for saving memory. In addition, we provide an extensive implementation study of ATA in a shared memory system, and extend its applicability to a distributed environment. To support our findings, we compare our algorithm with state-of-the-art solutions specialized in the computation of A^T A. Our experiments…
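To make the idea in the abstract concrete, here is a minimal Python/NumPy sketch of a block-recursive A^T A that exploits the symmetry of the result. It is an illustration, not the authors' implementation: the function name `ata`, the `leaf` cutoff, and the 2×2 splitting are assumptions, NumPy's `@` stands in for the Strassen sub-routine, and the full matrix is filled in for readability, whereas a memory-saving version would keep only one triangle.

```python
import numpy as np


def ata(a, leaf=64):
    """Return a.T @ a via a 2x2 block recursion (illustrative sketch only)."""
    m, n = a.shape
    # Base case: small blocks go straight to the library product.
    if m <= leaf or n <= leaf:
        return a.T @ a

    mh, nh = m // 2, n // 2
    a11, a12 = a[:mh, :nh], a[:mh, nh:]
    a21, a22 = a[mh:, :nh], a[mh:, nh:]

    c = np.empty((n, n), dtype=a.dtype)
    # Diagonal blocks are themselves "matrix times its transpose" problems,
    # so they recurse into ata.
    c[:nh, :nh] = ata(a11, leaf) + ata(a21, leaf)
    c[nh:, nh:] = ata(a12, leaf) + ata(a22, leaf)
    # Off-diagonal block: two general products. The `@` operator is a
    # stand-in here for the Strassen sub-routine (an assumption of this sketch).
    c[nh:, :nh] = a12.T @ a11 + a22.T @ a21
    # Symmetry: the upper-right block is the transpose of the lower-left one,
    # so it is copied rather than recomputed.
    c[:nh, nh:] = c[nh:, :nh].T
    return c
```

Calling `ata(a)` on a rectangular array `a` of shape `(m, n)` returns an `(n, n)` array. Because integer halving is used for the split, odd dimensions need no padding in this sketch; the upper-right block costs no extra multiplications, which is where the savings over a general product come from.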
Taxonomy
Topics: Matrix Theory and Algorithms · Stochastic Gradient Optimization Techniques · Parallel Computing and Optimization Techniques
