Leveraging Caliper and Benchpark to Analyze MPI Communication Patterns: Insights from AMG2023, Kripke, and Laghos
Grace Nansamba, Evelyn Namugwanya, David Boehme, Dewi Yokelson, Riley Shipley, Derek Schafer, Michael McKinsey, Olga Pearce, Anthony Skjellum

TL;DR
This paper introduces communication regions into the Caliper HPC profiling tool to capture detailed MPI communication metrics, providing new insights into communication patterns and bottlenecks in HPC applications.
Contribution
The paper presents a novel extension to Caliper for capturing communication-specific metrics, enabling detailed analysis of MPI communication behaviors in HPC applications.
Findings
Enhanced visualization of MPI communication patterns.
Identification of communication bottlenecks.
Insights into scalability differences on CPU and GPU systems.
Abstract
We introduce "communication regions" into the widely used Caliper HPC profiling tool. A communication region is an annotation enabling capture of metrics about the data being communicated (including statistics of these metrics), and metrics about the MPI processes involved in the communications, something not previously possible in Caliper. We explore the utility of communication regions with three representative modeling and simulation applications, AMG2023, Kripke, and Laghos, all part of the comprehensive Benchpark suite that includes Caliper annotations. Enhanced Caliper reveals detailed communication behaviors. Using Caliper and Thicket in tandem, we create new visualizations of MPI communication patterns, including halo exchanges. Our findings reveal communication bottlenecks and detailed behaviors, indicating significant utility of the special-regions addition to Caliper. The…
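To make the idea of per-region communication metrics concrete, the sketch below aggregates a small, made-up message log by region label. It is a standalone illustration and does not use Caliper, Thicket, or MPI. The `Message` record, the `summarize_region` helper, the region names, and the sample sizes are all hypothetical, chosen only to show the kind of statistics the abstract describes: message counts, size statistics, and the ranks involved.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Message:
    """One point-to-point send, tagged with a hypothetical region label."""
    region: str   # e.g. "halo_exchange" (illustrative name, not from the paper)
    src: int      # sending rank
    dst: int      # receiving rank
    nbytes: int   # payload size in bytes


def summarize_region(messages, region):
    """Aggregate size statistics and participating ranks for one region."""
    selected = [m for m in messages if m.region == region]
    if not selected:
        return None
    sizes = [m.nbytes for m in selected]
    return {
        "sends": len(sizes),
        "min_bytes": min(sizes),
        "max_bytes": max(sizes),
        "avg_bytes": mean(sizes),
        "total_bytes": sum(sizes),
        "src_ranks": sorted({m.src for m in selected}),
        "dst_ranks": sorted({m.dst for m in selected}),
    }


# Made-up log: three halo-exchange sends and one unrelated small message.
log = [
    Message("halo_exchange", 0, 1, 4096),
    Message("halo_exchange", 1, 0, 4096),
    Message("halo_exchange", 1, 2, 8192),
    Message("reduction", 2, 0, 8),
]

summary = summarize_region(log, "halo_exchange")
print(summary)
```

Grouping by region label is what separates, say, a halo exchange from an unrelated reduction in the same run. In an annotated application the profiling tool would collect these records itself, rather than reading them from a hand-built list as done here.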
