Improving the Space-Time Efficiency of Processor-Oblivious Matrix Multiplication Algorithms
Yuan Tang

TL;DR
This paper introduces new processor-oblivious matrix multiplication algorithms that optimize time, space, and cache efficiency simultaneously, supported by theoretical analysis and empirical validation.
Contribution
It presents novel algorithms achieving sublinear time with optimal work, space, and cache bounds for general and Strassen-like matrix multiplication, improving upon classic methods.
Findings
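The proposed algorithms achieve sublinear time while keeping work, space, and cache bounds asymptotically optimal.
Experiments show empirical advantages over the classic counterparts.
The study offers new theoretical and empirical angles on optimizing cache-oblivious parallel algorithms.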
Abstract
Classic cache-oblivious parallel matrix multiplication algorithms achieve optimality in either time or space, but not both, which has prompted much research on the best possible balance or tradeoff between the two. We study modern processor-oblivious runtime systems and identify several ways to improve an algorithm's time bound while keeping its space and cache requirements asymptotically optimal. From this study, we derive sublinear-time algorithms with optimal work, space, and cache bounds for both general matrix multiplication on a semiring and Strassen-like fast algorithms. Our experiments also show that these algorithms have empirical advantages over their classic counterparts. Our study provides new insights and research angles on how to optimize cache-oblivious parallel algorithms from both theoretical and empirical perspectives.
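For context, the classic tradeoff mentioned in the abstract can be illustrated with the textbook 2-way divide-and-conquer matrix multiplication. The sketch below is not the paper's new algorithm. It is a sequential Python rendering of the well-known in-place variant, and the names `matmul_add` and `base` are illustrative assumptions. It also assumes the matrix dimension is a power of two. The comments mark where a parallel runtime would fork and synchronize.

```python
def matmul_add(C, A, B, n, ci=0, cj=0, ai=0, aj=0, bi=0, bj=0, base=2):
    """Compute C_block += A_block * B_block for n x n blocks (lists of lists).

    Assumption: n is a power of two, so every split yields equal halves.
    The (ci, cj), (ai, aj), (bi, bj) pairs are the top-left corners of the
    blocks inside the full matrices, so no submatrix is ever copied.
    """
    if n <= base:
        # Base case: plain triple loop on a small block.
        for i in range(n):
            for k in range(n):
                a = A[ai + i][aj + k]
                for j in range(n):
                    C[ci + i][cj + j] += a * B[bi + k][bj + j]
        return

    h = n // 2
    quadrants = ((0, 0), (0, h), (h, 0), (h, h))

    # Phase 1: C[r][c] += A[r][0] * B[0][c].
    # The four calls write disjoint quadrants of C, so a parallel runtime
    # could fork all four here.
    for r, c in quadrants:
        matmul_add(C, A, B, h, ci + r, cj + c, ai + r, aj, bi, bj + c, base)

    # A parallel version needs a sync at this point: phase 2 updates the same
    # quadrants of C, so the two phases are serialized to avoid extra space.

    # Phase 2: C[r][c] += A[r][1] * B[1][c].
    for r, c in quadrants:
        matmul_add(C, A, B, h, ci + r, cj + c, ai + r, aj + h, bi + h, bj + c, base)


if __name__ == "__main__":
    n = 4
    A = [[i + j for j in range(n)] for i in range(n)]
    B = [[i * j + 1 for j in range(n)] for i in range(n)]
    C = [[0] * n for _ in range(n)]
    matmul_add(C, A, B, n)
    print(C[0])
```

Serializing the two phases keeps the computation in place, but it makes the critical path grow linearly in the matrix dimension. The other classic variant runs all eight recursive products in parallel by writing half of them into a temporary matrix, which shortens the critical path to polylogarithmic at the cost of additional temporary space. The abstract describes algorithms that aim to avoid choosing between these two extremes.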
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Algorithms and Data Compression
