Balanced Partitioning of Several Cache-Oblivious Algorithms
Yuan Tang, Weiguo Gao

TL;DR
This paper introduces PACO (Processor-Aware but Cache-Oblivious), a way of partitioning cache-oblivious algorithms that achieves perfect strong scaling on an arbitrary number of processors, including prime counts, within a certain range in a shared-memory setting. It is demonstrated on several algorithms, including Strassen's.
Contribution
The paper presents PACO, a partitioning technique that lets parallel cache-oblivious algorithms scale on an arbitrary number of processors, including prime numbers, and reports improvements over existing methods.
Findings
PACO achieves perfect strong scaling for multiple algorithms within a certain range of processor counts.
PACO algorithms outperform the state of the art in scalability and cache complexity.
Preliminary experiments confirm significant performance gains over existing methods.
Abstract
Frigo et al. proposed an ideal cache model and a recursive technique to design sequential cache-efficient algorithms in a cache-oblivious fashion. Ballard et al. pointed out that it is a fundamental open problem to extend the technique to an arbitrary architecture. Ballard et al. raised another open question on how to parallelize Strassen's algorithm exactly and efficiently on an arbitrary number of processors. We propose a novel way of partitioning a cache-oblivious algorithm to achieve perfect strong scaling on an arbitrary number, even a prime number, of processors within a certain range in a shared-memory setting. Our approach is Processor-Aware but Cache-Oblivious (PACO). We demonstrate our approach on several important cache-oblivious algorithms, including LCS, 1D, GAP, classic rectangular matrix multiplication on a semiring, and Strassen's algorithm. We discuss how to extend…
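As a rough illustration of the recursive cache-oblivious technique the abstract attributes to Frigo et al., the sketch below multiplies rectangular matrices by repeatedly halving the largest dimension. It is not PACO's partitioning and is not taken from the paper; the names `co_matmul` and `matmul` are hypothetical, and the base case of a single scalar product is an assumption chosen for clarity rather than speed.

```python
def co_matmul(A, B, C, i0, i1, j0, j1, k0, k1):
    """Accumulate A[i0:i1, k0:k1] @ B[k0:k1, j0:j1] into C[i0:i1, j0:j1]."""
    m, n, p = i1 - i0, j1 - j0, k1 - k0
    if m == 0 or n == 0 or p == 0:
        return  # empty sub-problem
    if m == 1 and n == 1 and p == 1:
        # Assumed base case: one scalar multiply-add.
        C[i0][j0] += A[i0][k0] * B[k0][j0]
        return
    # Halve the largest dimension. No cache size appears anywhere,
    # which is what makes the recursion cache-oblivious.
    if m >= n and m >= p:
        mid = i0 + m // 2
        co_matmul(A, B, C, i0, mid, j0, j1, k0, k1)
        co_matmul(A, B, C, mid, i1, j0, j1, k0, k1)
    elif n >= p:
        mid = j0 + n // 2
        co_matmul(A, B, C, i0, i1, j0, mid, k0, k1)
        co_matmul(A, B, C, i0, i1, mid, j1, k0, k1)
    else:
        mid = k0 + p // 2
        co_matmul(A, B, C, i0, i1, j0, j1, k0, mid)
        co_matmul(A, B, C, i0, i1, j0, j1, mid, k1)


def matmul(A, B):
    """Return A @ B for list-of-lists matrices using the recursion above."""
    m, p = len(A), len(B)
    n = len(B[0]) if B else 0
    C = [[0] * n for _ in range(m)]
    co_matmul(A, B, C, 0, m, 0, n, 0, p)
    return C
```

For example, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. The two halves produced by a split along the row or column dimension write to disjoint parts of `C`, which is why such recursions are natural candidates for parallel execution; how to balance that work across an arbitrary processor count is the question the paper addresses.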
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Stochastic Gradient Optimization Techniques · Advanced Data Storage Technologies
