Scalable Comparative Visualization of Ensembles of Call Graphs
Suraj P. Kesavan, Harsh Bhatia, Abhinav Bhatele, Todd Gamblin, Peer-Timo Bremer, Kwan-Liu Ma

TL;DR
This paper introduces an enhanced version of the CallFlow visualization tool for exploring ensembles of call graphs. It combines Sankey diagrams and box plots to analyze structural differences and performance variability across performance profiles of large-scale parallel codes.
Contribution
The paper presents a novel visualization method, ensemble-Sankey, and an interactive interface for analyzing multiple call graphs simultaneously, improving understanding of performance variability and structural differences.
Findings
Effective visualization of large call graph ensembles demonstrated
Facilitates identification of performance bottlenecks and structural differences
Case studies show improved analysis capabilities
Abstract
Optimizing the performance of large-scale parallel codes is critical for efficient utilization of computing resources. Code developers often explore various execution parameters, such as hardware configurations, system software choices, and application parameters, and are interested in detecting and understanding bottlenecks in different executions. They often collect hierarchical performance profiles represented as call graphs, which combine performance metrics with their execution contexts. The crucial task of exploring multiple call graphs together is tedious and challenging because of the many structural differences in the execution contexts and significant variability in the collected performance metrics (e.g., execution runtime). In this paper, we present an enhanced version of CallFlow to support the exploration of ensembles of call graphs using new types of visualizations,…
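As a rough illustration of what exploring an ensemble of call graphs involves, the sketch below merges several toy profiles into one structure, summarizes per-call-path runtime variability with the five statistics a box plot draws, and lists call paths missing from some runs. It is not CallFlow's implementation or API: the profile layout, the run names, and the functions `build_ensemble`, `five_number_summary`, and `structural_differences` are all hypothetical and chosen only for illustration.

```python
from statistics import quantiles

# Hypothetical toy data: each run maps a call path (tuple of function
# names, root first) to an inclusive runtime in seconds.
runs = {
    "run_a": {("main",): 10.0, ("main", "solve"): 7.0, ("main", "io"): 2.0},
    "run_b": {("main",): 12.0, ("main", "solve"): 9.0, ("main", "io"): 2.5},
    "run_c": {("main",): 11.0, ("main", "solve"): 6.0, ("main", "mpi_wait"): 4.0},
}


def build_ensemble(runs):
    """Union the call paths of all runs, keeping each run's runtime per path."""
    ensemble = {}
    for run_name, profile in runs.items():
        for path, runtime in profile.items():
            ensemble.setdefault(path, {})[run_name] = runtime
    return ensemble


def five_number_summary(values):
    """Return (min, q1, median, q3, max), the statistics a box plot shows."""
    values = sorted(values)
    if len(values) == 1:
        # A single sample has no spread; every statistic equals the sample.
        return (values[0],) * 5
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (values[0], q1, median, q3, values[-1])


def structural_differences(ensemble, runs):
    """Map each call path absent from at least one run to the runs lacking it."""
    all_runs = set(runs)
    return {
        path: sorted(all_runs - set(per_run))
        for path, per_run in ensemble.items()
        if set(per_run) != all_runs
    }


ensemble = build_ensemble(runs)
for path, per_run in ensemble.items():
    print(" -> ".join(path), five_number_summary(per_run.values()))
print(structural_differences(ensemble, runs))
```

In this toy ensemble, `main -> mpi_wait` appears only in `run_c` and `main -> io` is missing from it, which is the kind of structural difference that makes comparing many call graphs side by side tedious without a combined view.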
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques
