TL;DR
COSTA (Communication-Optimal Shuffle and Transpose Algorithm) reshuffles and transposes distributed matrices for linear algebra, using process relabeling to minimize communication cost, including on heterogeneous networks.
Contribution
The paper shows that process relabeling, found by solving a Linear Assignment Problem, makes data reshuffling communication-optimal, and builds COSTA, a shuffle and transpose algorithm for distributed systems, on that result.
Findings
Outperforms existing ScaLAPACK routines in speed.
Effectively handles heterogeneous network topologies.
Enables integration of optimal matrix multiplication in applications.
Abstract
Communication-avoiding algorithms for linear algebra have become increasingly popular, in particular for distributed memory architectures. In practice, these algorithms assume that the data is already distributed in a specific way, so data reshuffling is key to using them. For performance reasons, a straightforward all-to-all exchange must be avoided. Here, we show that process relabeling (i.e., permuting processes in the final layout) can be used to obtain communication optimality for data reshuffling, and that it can be efficiently found by solving a Linear Assignment Problem (Maximum Weight Bipartite Perfect Matching). Based on this, we have developed a Communication-Optimal Shuffle and Transpose Algorithm (COSTA): this highly-optimised algorithm implements $A=\alpha\cdot \operatorname{op}(B) + \beta \cdot A,\ \operatorname{op} \in \{\operatorname{transpose},…
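The relabeling idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the `volume` matrix, the function names, and the uniform per-element communication cost are all assumptions made for illustration, and an exhaustive search over permutations stands in for a proper Linear Assignment Problem solver (which would be needed for realistic process counts). Here `volume[i][j]` is the amount of data process `i` holds initially that belongs to role `j` in the final layout; assigning role `j` to the process that already holds most of its data keeps that data local.

```python
from itertools import permutations


def comm_volume(volume, relabel):
    """Data that must cross the network under a given relabeling.

    volume[i][j]: data held by process i that belongs to target role j.
    relabel[j]:   physical process that takes over target role j.
    Assumes every remote element costs the same (hypothetical cost model).
    """
    n = len(volume)
    return sum(
        volume[i][j]
        for i in range(n)
        for j in range(n)
        if relabel[j] != i  # data already on the owning process stays local
    )


def best_relabeling(volume):
    """Pick the permutation that maximizes the data kept local.

    This is a maximum weight perfect matching between processes and
    target roles, solved here by brute force for clarity only.
    """
    n = len(volume)
    best = max(
        permutations(range(n)),
        key=lambda p: sum(volume[p[j]][j] for j in range(n)),
    )
    return list(best)


# Hypothetical 3-process example.
volume = [
    [1, 8, 1],
    [2, 1, 7],
    [9, 1, 0],
]
relabel = best_relabeling(volume)
print(relabel)                          # [2, 0, 1]
print(comm_volume(volume, [0, 1, 2]))   # 28 without relabeling
print(comm_volume(volume, relabel))     # 6 with relabeling
```

In this toy case, relabeling cuts the exchanged volume from 28 to 6 units because each target role is handed to the process that already owns most of its data. A heterogeneous network would replace the uniform cost with per-link weights, which this sketch does not model.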
