A Multi-Level, Multi-Scale Visual Analytics Approach to Assessment of Multifidelity HPC Systems
Shilpika, Bethany Lusch, Murali Emani, Filippo Simini, Venkatram Vishwanath, Michael E. Papka, and Kwan-Liu Ma

TL;DR
This paper presents a multi-level, multi-scale visual analytics system that uses multiresolution dynamic mode decomposition (mrDMD) to analyze massive supercomputing logs, helping users detect usage and error patterns across complex HPC systems.
Contribution
It introduces an integrated analytical framework combining multiresolution dynamic mode decomposition with visual analytics for comprehensive supercomputer log analysis.
Findings
Effective extraction of spatial-temporal patterns from large log datasets
Identification of usage and error patterns at multiple system levels
Demonstrated success with Cray XC40 supercomputer scenarios
Abstract
The ability to monitor and interpret hardware system events and behaviors is crucial to improving the robustness and reliability of these systems, especially in a supercomputing facility. The growing complexity and scale of these systems demand an increase in monitoring data collected at multiple fidelity levels and varying temporal resolutions. In this work, we aim to build a holistic analytical system that helps make sense of such massive data, mainly the hardware logs, job logs, and environment logs collected from disparate subsystems and components of a supercomputer system. This end-to-end log analysis system, coupled with visual analytics support, allows users to glean and promptly extract supercomputer usage and error patterns at varying temporal and spatial resolutions. We use multiresolution dynamic mode decomposition (mrDMD), a technique that depicts high-dimensional data…
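As a rough illustration of the decomposition named in the abstract, the sketch below runs one level of exact dynamic mode decomposition (DMD) on a synthetic snapshot matrix and flags the slow modes. mrDMD, as generally described, repeats this kind of step recursively on progressively shorter time windows after removing the slow modes. This is not the authors' implementation: the function names `dmd` and `mode_frequencies`, the synthetic "sensor" data, the rank, and the 1.0 Hz cutoff are all assumptions made for the example.

```python
import numpy as np


def dmd(X, rank):
    """Exact DMD of a snapshot matrix X with shape (n_features, n_times)."""
    X1, X2 = X[:, :-1], X[:, 1:]            # snapshots and their successors
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    # Project the best-fit linear operator (X2 ~ A @ X1) onto the leading modes.
    B = X2 @ Vh.conj().T @ np.diag(1.0 / s)
    A_tilde = U.conj().T @ B
    eigvals, W = np.linalg.eig(A_tilde)
    modes = B @ W                            # exact DMD modes
    return eigvals, modes


def mode_frequencies(eigvals, dt):
    """Oscillation frequency (Hz) of each mode from its discrete eigenvalue."""
    return np.abs(np.log(eigvals.astype(complex)).imag) / (2.0 * np.pi * dt)


# Synthetic stand-in for log-derived time series (hypothetical data):
# 8 "sensors", one slow (0.2 Hz) and one fast (1.5 Hz) oscillation.
rng = np.random.default_rng(0)
dt = 0.1
t = np.arange(200) * dt
a, b, c, d = rng.normal(size=(4, 8))
X = (np.outer(a, np.cos(2 * np.pi * 0.2 * t))
     + np.outer(b, np.sin(2 * np.pi * 0.2 * t))
     + np.outer(c, np.cos(2 * np.pi * 1.5 * t))
     + np.outer(d, np.sin(2 * np.pi * 1.5 * t)))

eigvals, modes = dmd(X, rank=4)
freqs = mode_frequencies(eigvals, dt)

# One mrDMD-style step: keep the modes slower than an assumed cutoff.
cutoff_hz = 1.0
slow = freqs <= cutoff_hz
print("mode frequencies (Hz):", np.sort(freqs))
print("number of slow modes:", int(slow.sum()))
```

In a multiresolution setting, the slow modes selected here would be subtracted from the data and the same step applied to each half of the time window. The cutoff and the number of recursion levels are tuning choices that this sketch does not attempt to justify.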
Taxonomy
Topics: Data Visualization and Analytics · Cell Image Analysis Techniques · Anomaly Detection Techniques and Applications
Methods: Visual Analytics
