Modeling Data Movement Performance on Heterogeneous Architectures
Amanda Bienz, Luke N. Olson, William D. Gropp, and Shelby Lockhart

TL;DR
This paper develops performance models for data movement on heterogeneous architectures, analyzing different communication paths and proposing an optimization that improves MPI collective operations by leveraging all CPU cores.
Contribution
The paper introduces performance models for each inter-node communication path on heterogeneous systems, including the trade-off between GPUDirect communication and copying to CPUs, and a novel optimization that uses all available CPU cores per node.
Findings
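Performance models capture the cost of data movement along each inter-node path and help pinpoint communication bottlenecks.
The model-based optimization improves the performance of MPI collective operations.
The improvement comes from using all available CPU cores per node for inter-node communication.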
Abstract
The cost of data movement on parallel systems varies greatly with machine architecture, job partition, and nearby jobs. Performance models that accurately capture the cost of data movement provide a tool for analysis, allowing for communication bottlenecks to be pinpointed. Modern heterogeneous architectures yield increased variance in data movement as there are a number of viable paths for inter-GPU communication. In this paper, we present performance models for the various paths of inter-node communication on modern heterogeneous architectures, including the trade-off between GPUDirect communication and copying to CPUs. Furthermore, we present a novel optimization for inter-node communication based on these models, utilizing all available CPU cores per node. Finally, we show associated performance improvements for MPI collective operations.
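As a rough illustration of the trade-off the abstract describes, the sketch below compares a GPUDirect path with a path staged through CPUs using a simple postal model (time = latency + bytes × per-byte cost). This is not the paper's model: the `PostalParams` class, the function names, the injection-rate cap, and every numeric parameter are hypothetical placeholders chosen for illustration, and real values would have to be measured on the target machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostalParams:
    """Postal-model parameters: cost(n) = alpha + beta * n."""
    alpha: float  # per-message latency, seconds
    beta: float   # per-byte cost, seconds per byte

    def cost(self, nbytes: int) -> float:
        return self.alpha + self.beta * nbytes


# Hypothetical placeholder parameters, NOT measurements from the paper.
GPUDIRECT = PostalParams(alpha=5.0e-6, beta=1.0e-10)         # GPU to GPU, inter-node
CPU_NETWORK = PostalParams(alpha=1.0e-6, beta=4.0e-10)       # one CPU core, inter-node
DEVICE_HOST_COPY = PostalParams(alpha=1.0e-5, beta=2.0e-11)  # GPU <-> CPU copy
INJECTION_RATE = 1.0e10  # assumed per-node network limit, bytes per second


def gpudirect_time(nbytes: int) -> float:
    """Modeled time to send nbytes directly between GPUs on different nodes."""
    return GPUDIRECT.cost(nbytes)


def staged_time(nbytes: int, cores: int = 1) -> float:
    """Modeled time to copy to the CPU, send with `cores` cores, copy back to a GPU.

    The message is split evenly across cores, which are assumed to send
    concurrently; the node-wide injection rate caps the achievable speedup.
    """
    copies = 2 * DEVICE_HOST_COPY.cost(nbytes)     # one copy on each node
    chunk = -(-nbytes // cores)                    # ceiling division
    per_core = CPU_NETWORK.beta * chunk            # time for one core's share
    node_limit = nbytes / INJECTION_RATE           # network cannot go faster than this
    return copies + CPU_NETWORK.alpha + max(per_core, node_limit)


def cheaper_path(nbytes: int, cores: int) -> str:
    """Return the name of the path with the lower modeled cost."""
    direct = gpudirect_time(nbytes)
    staged = staged_time(nbytes, cores)
    return "gpudirect" if direct <= staged else "staged"


if __name__ == "__main__":
    for size in (1 << 10, 1 << 20, 1 << 26):
        for cores in (1, 4, 16):
            print(size, cores, cheaper_path(size, cores))
```

With such a model, one can sweep message sizes and core counts to see where each path is predicted to win. Which path is actually cheaper depends entirely on the measured parameters, so the placeholder values above should not be read as a result.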
Taxonomy
Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
