Trace-based, time-resolved analysis of MPI application performance using standard metrics

Kingshuk Haldar

arXiv:2512.01764·cs.DC·December 2, 2025

TL;DR

This paper introduces a method for detailed, time-resolved analysis of MPI application performance metrics from execution traces, revealing transient bottlenecks often hidden in aggregated data.

Contribution

It presents a novel approach to compute standard MPI metrics over fixed or adaptive time segments, enabling detailed performance analysis even with large trace sizes.

Findings

01. Time-resolved metrics uncover localized performance bottlenecks.

02. The method is scalable and effective on real-world applications.

03. It handles common trace anomalies robustly.

Abstract

Detailed trace analysis of MPI applications is essential for performance engineering, but growing trace sizes and complex communication behaviour often render comprehensive visual inspection impractical. This work presents a trace-based calculation of time-resolved values of standard MPI performance metrics (load balance, serialisation, and transfer efficiency) by discretising execution traces into fixed or adaptive time segments. The implementation processes Paraver traces postmortem, reconstructing critical execution paths and handling common event anomalies, such as clock inconsistencies and unmatched MPI events, to robustly calculate metrics for each segment. The calculated per-window metric values expose transient performance bottlenecks that the time-aggregated metrics from existing tools may conceal. Evaluations on a synthetic benchmark and real-world applications (LaMEM and…
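To illustrate why per-window values can differ from a whole-run value, the following is a minimal sketch of one metric, load balance, computed over fixed time windows. It assumes the commonly used definition (mean useful computation time across ranks divided by the maximum); the paper's exact formulation, its adaptive segmentation, and its Paraver parsing are not reproduced here. The function names, the `trace` dictionary, and its burst intervals are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

# (start, end) of one useful-computation burst, in seconds (assumed input form)
Interval = Tuple[float, float]


def overlap(start: float, end: float, lo: float, hi: float) -> float:
    """Length of the intersection of [start, end] and [lo, hi]."""
    return max(0.0, min(end, hi) - max(start, lo))


def windowed_load_balance(
    useful: Dict[int, List[Interval]], t_end: float, window: float
) -> List[Optional[float]]:
    """Load balance (mean / max useful time across ranks) per fixed window.

    Assumes `useful` holds at least one rank.
    """
    n_windows = math.ceil(t_end / window)
    result: List[Optional[float]] = []
    for w in range(n_windows):
        lo, hi = w * window, min((w + 1) * window, t_end)
        # Useful computation time of each rank clipped to this window.
        per_rank = [
            sum(overlap(s, e, lo, hi) for s, e in bursts)
            for bursts in useful.values()
        ]
        busiest = max(per_rank)
        # The ratio is undefined when no rank computes in this window.
        result.append(
            sum(per_rank) / len(per_rank) / busiest if busiest > 0 else None
        )
    return result


# Hypothetical two-rank trace: the ranks compute one after the other.
trace = {0: [(0.0, 2.0)], 1: [(2.0, 4.0)]}
whole_run = windowed_load_balance(trace, t_end=4.0, window=4.0)
per_window = windowed_load_balance(trace, t_end=4.0, window=2.0)
print(whole_run, per_window)
```

In this toy trace both ranks perform the same total amount of work, so a single window spanning the whole run reports perfect balance, while two-second windows show that only one rank is computing at any given time.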

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Real-Time Systems Scheduling · Network Time Synchronization Technologies