Collectives in hybrid MPI+MPI code: design, practice and performance

Huan Zhou; Jose Gracia; Naweiluo Zhou; Ralf Schneider

arXiv:2007.11496·cs.DC·July 23, 2020

Open Access

TL;DR

This paper introduces a new design method for hybrid MPI+MPI collective communication that reduces on-node memory overhead and improves performance, validated through benchmarks and computational kernels.

Contribution

It proposes a novel design approach for MPI+MPI collective operations, including wrapper primitives and best practices, enhancing efficiency over traditional methods.

Findings

01. Micro-benchmarks show performance comparable to or better than pure MPI.

02. The method's effectiveness is validated in three computational kernels.

03. The method reduces on-node communication overheads.

Abstract

Hybrid schemes that combine a message-passing programming model for inter-node parallelism with a shared-memory programming model for node-level parallelism are widespread. Extensive existing practice with hybrid Message Passing Interface (MPI) plus Open Multi-Processing (OpenMP) programming accounts for their popularity. Nevertheless, substantial programming effort is required to gain performance benefits from MPI+OpenMP code. An emerging hybrid method that combines MPI and the MPI shared memory model (MPI+MPI) is promising. However, writing an efficient hybrid MPI+MPI program, especially when collective communication operations are involved, cannot be taken for granted. In this paper, we propose a new design method to implement hybrid MPI+MPI context-based collective communication operations. Our method avoids on-node memory replications (on-node communication…
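To make the phrase "on-node memory replications" concrete, here is a back-of-the-envelope sketch, not the authors' implementation, that compares the receive-buffer footprint of an allgather on a single node. It assumes that in pure MPI every rank keeps its own private copy of the gathered result, and that a hybrid MPI+MPI design could keep one copy per node in a shared-memory window. The function name and parameters are hypothetical and chosen only for illustration.

```python
def allgather_node_footprint(nodes, ranks_per_node, bytes_per_rank):
    """Return (pure_mpi_bytes, shared_bytes) of result-buffer memory on ONE node.

    Illustrative model only; all names here are hypothetical.
    """
    total_ranks = nodes * ranks_per_node
    # Size of one full gathered result: one contribution from every rank.
    result_bytes = total_ranks * bytes_per_rank
    # Pure MPI: each on-node rank holds a private copy of the full result.
    pure_mpi_bytes = ranks_per_node * result_bytes
    # Assumed hybrid layout: a single copy per node in a shared window.
    shared_bytes = result_bytes
    return pure_mpi_bytes, shared_bytes


if __name__ == "__main__":
    pure, shared = allgather_node_footprint(
        nodes=4, ranks_per_node=24, bytes_per_rank=1024
    )
    print(f"pure MPI: {pure} B, shared window: {shared} B, ratio: {pure // shared}x")
```

Under these assumptions the saving on each node scales with the number of ranks per node, which is why the effect matters most on nodes with many cores.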

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies