On the performance of two-sided MPI, MPI-3 RMA and SHMEM in a Lagrangian particle cluster algorithm
Matthias Frey, Douglas Shanks, Steven Böing, Rui F. G. Apóstolo

TL;DR
This paper compares the parallel performance of two-sided (point-to-point) MPI, MPI-3 RMA, and SHMEM communication in a Lagrangian particle cluster algorithm for N-body simulations, with a focus on scalability across three computing systems.
Contribution
It compares three distributed-memory communication models within the nearest-neighbour cluster algorithm of the elliptical parcel-in-cell (EPIC) method, which targets geophysical fluid flows.
Findings
MPI-3 RMA scales better than two-sided MPI.
SHMEM shows competitive performance on certain systems.
Performance varies significantly with hardware interconnects.
Abstract
In this paper, we compare the parallel performance of three distributed-memory communication models for a cluster algorithm based on a nearest neighbour search algorithm for N-body simulations. The nearest neighbour is defined by the Euclidean distance in three-dimensional space. The resulting directed nearest neighbour graphs that are used to define the clusters are pruned in an iterative procedure where we use either point-to-point message passing interface (MPI), MPI-3 remote memory access (RMA), or SHMEM communication. The original algorithm has been developed and implemented as part of the elliptical parcel-in-cell (EPIC) method targeting geophysical fluid flows. The parallel scalability of the algorithm is discussed by means of an artificial and a standard fluid dynamics test case. Performance measurements were carried out on three different computing systems with InfiniBand FDR,…
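The directed nearest neighbour graph mentioned in the abstract can be illustrated with a minimal serial sketch. It is not the EPIC implementation and does not reproduce the paper's iterative pruning or any of its communication; the function names `nearest_neighbour_graph` and `mutual_pairs`, the brute-force search, and the sample points are assumptions made for illustration only. The second function simply shows why the graph is directed: an edge from i to j does not imply an edge from j to i.

```python
import math

def nearest_neighbour_graph(points):
    """Return nn, where nn[i] is the index of the point closest to point i.

    Distances are Euclidean in three-dimensional space. This is a
    brute-force O(n^2) search, chosen for clarity (an assumption, not
    the method used in the paper).
    """
    nn = []
    for i, p in enumerate(points):
        best, best_d = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < best_d:
                best, best_d = j, d
        nn.append(best)
    return nn

def mutual_pairs(nn):
    """Return sorted pairs (i, j), i < j, that are each other's nearest neighbour."""
    return sorted((i, j) for i, j in enumerate(nn) if i < j and nn[j] == i)

# Hypothetical sample data: five points on the x-axis.
points = [(0, 0, 0), (1, 0, 0), (3, 0, 0), (3.5, 0, 0), (10, 0, 0)]
nn = nearest_neighbour_graph(points)
pairs = mutual_pairs(nn)
print(nn)     # directed edges i -> nn[i]
print(pairs)  # edges that happen to be symmetric
```

In this sample, the last point has a nearest neighbour that does not point back to it, so its edge is one-directional. In a distributed-memory setting the neighbour of a point may live on another process, which is where the choice of communication model compared in the paper comes in.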
Taxonomy
Topics: Advanced Clustering Algorithms Research
