Sparbit: a new logarithmic-cost and data locality-aware MPI Allgather algorithm
Wilton Jaciel Loch, Guilherme Piêgas Koslovski

TL;DR
Sparbit is a new MPI Allgather algorithm that optimizes data locality and reduces communication costs, significantly outperforming traditional methods in many scenarios for high-performance computing.
Contribution
The paper introduces the Sparbit algorithm, which combines logarithmic cost with data locality awareness and improves MPI Allgather performance without restrictions.
Findings
Surpassed traditional MPI algorithms in 46.43% of test cases.
Achieved a mean improvement of 34.7% and a median improvement of 26.16%.
Reached up to 84.16% performance gain.
Abstract
Collective operations are considered critical for improving the performance of exascale-ready and high-performance computing applications. In this paper we focus on the Message-Passing Interface (MPI) Allgather many-to-many collective, which is among the most frequently called and time-consuming operations. Each MPI algorithm for this call suffers from different operational and performance limitations, which may include working only in restricted cases, requiring a number of communication steps that grows linearly with the number of processes, needing memory copies and shifts to ensure correct data organization, and using non-local data exchange patterns, most of which add to the total operation time. All these characteristics create an environment in which no algorithm is best for all cases, which consequently implies that careful choices of alternatives must be made to…
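To make the contrast between linear and logarithmic step counts concrete, the following is a minimal in-memory sketch of two classic Allgather schedules: a ring exchange and recursive doubling. It is not Sparbit and not the authors' code. It uses no MPI library, the function names `ring_allgather` and `recursive_doubling_allgather` are hypothetical, and each "block" is simply the integer rank that contributed it.

```python
def _initial_buffers(p):
    # buffers[r][i] is the block contributed by rank i as seen by rank r,
    # or None if rank r has not received it yet.
    return [[i if i == r else None for i in range(p)] for r in range(p)]


def ring_allgather(p):
    """Simulate a ring Allgather among p ranks; return (buffers, steps)."""
    buffers = _initial_buffers(p)
    steps = 0
    for s in range(p - 1):
        # In step s, rank r forwards block (r - s) mod p to rank (r + 1) mod p.
        # That block is either its own (s == 0) or the one received last step.
        sends = [((r + 1) % p, (r - s) % p) for r in range(p)]
        for dest, block in sends:
            buffers[dest][block] = block
        steps += 1
    return buffers, steps


def recursive_doubling_allgather(p):
    """Simulate recursive doubling; only valid when p is a power of two."""
    if p < 1 or p & (p - 1):
        raise ValueError("recursive doubling requires a power-of-two process count")
    buffers = _initial_buffers(p)
    steps = 0
    dist = 1
    while dist < p:
        # Snapshot so that every exchange in this step uses pre-step data.
        snapshot = [list(b) for b in buffers]
        for r in range(p):
            partner = r ^ dist  # partner differs from r in exactly one bit
            for i, block in enumerate(snapshot[partner]):
                if block is not None:
                    buffers[r][i] = block
        dist *= 2
        steps += 1
    return buffers, steps
```

Under these assumptions, the ring schedule needs p - 1 steps for p ranks, while recursive doubling needs log2(p) steps but rejects process counts that are not powers of two. This illustrates two of the limitations listed above: step counts that grow linearly, and algorithms that only work in restricted cases.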
Taxonomy
Topics: Distributed systems and fault tolerance · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
