Efficient Process-to-Node Mapping Algorithms for Stencil Computations
Sascha Hunold, Konrad von Kirchbach, Markus Lehr, Christian Schulz, Jesper Larsson Träff

TL;DR
This paper introduces three novel distributed algorithms for process-to-node mapping in HPC applications whose processes communicate in sparse stencil patterns on Cartesian grids. The algorithms run up to two orders of magnitude faster than a sequential, high-quality general graph mapping tool while achieving similar communication performance.
Contribution
The paper presents new algorithms that exploit stencil structure for efficient process-to-node mapping, outperforming existing tools in speed and quality.
Findings
The algorithms run up to 100x faster than a sequential, high-quality general graph mapping tool.
They achieve similar communication performance with improved mapping quality.
They enable up to threefold performance gains in MPI communication operations.
Abstract
Good process-to-compute-node mappings can be decisive for well-performing HPC applications. A special, important class of process-to-node mapping problems is the problem of mapping processes that communicate in a sparse stencil pattern to Cartesian grids. By thoroughly exploiting the inherently present structure in this type of problem, we devise three novel distributed algorithms that are able to handle arbitrary stencil communication patterns effectively. We analyze the expected performance of our algorithms based on an abstract model of inter- and intra-node communication. An extensive experimental evaluation on several HPC machines shows that our algorithms are up to two orders of magnitude faster in running time than a (sequential) high-quality general graph mapping tool, while obtaining similar results in communication performance. Furthermore, our algorithms also achieve…
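To make the mapping problem concrete, the following is a minimal sketch, not one of the paper's three algorithms. It assumes a simple proxy cost (the number of directed stencil edges whose endpoints land on different nodes) and a non-periodic grid; the abstract does not specify the cost model, so both are assumptions. All function names (`inter_node_edges`, `row_major_mapping`, `blocked_mapping`) are hypothetical and chosen for illustration only.

```python
from itertools import product


def inter_node_edges(dims, stencil, node_of):
    """Count directed stencil edges whose endpoints lie on different nodes.

    Assumed proxy cost; the paper's own communication model may differ.
    """
    count = 0
    for coord in product(*(range(d) for d in dims)):
        for offset in stencil:
            nbr = tuple(c + o for c, o in zip(coord, offset))
            # Non-periodic grid (assumption): ignore neighbors outside it.
            if all(0 <= x < d for x, d in zip(nbr, dims)):
                if node_of(coord) != node_of(nbr):
                    count += 1
    return count


def row_major_mapping(dims, procs_per_node):
    """Consecutive row-major ranks fill one node after another."""
    def node_of(coord):
        rank = 0
        for c, d in zip(coord, dims):
            rank = rank * d + c
        return rank // procs_per_node
    return node_of


def blocked_mapping(block):
    """Each node owns one tile of the grid with the given block shape."""
    def node_of(coord):
        return tuple(c // b for c, b in zip(coord, block))
    return node_of


# 8 x 8 process grid, 5-point stencil (4 neighbors), 16 processes per node.
dims = (8, 8)
stencil = [(-1, 0), (1, 0), (0, -1), (0, 1)]

row_cost = inter_node_edges(dims, stencil, row_major_mapping(dims, 16))
blk_cost = inter_node_edges(dims, stencil, blocked_mapping((4, 4)))

print(row_cost)  # 48: three row boundaries, 8 columns, 2 directions
print(blk_cost)  # 32: two cuts of length 8, 2 directions each
```

In this toy instance, placing 4 x 4 tiles on each node cuts fewer stencil edges than filling nodes in rank order, which illustrates why the choice of mapping can matter for stencil codes.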
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
