CloudHeatMap: Heatmap-Based Monitoring for Large-Scale Cloud Systems
Sarah Sohana, William Pourmajidi, John Steinbacher, Andriy Miranskyy

TL;DR
CloudHeatMap is a heatmap visualization tool for near-real-time monitoring of large-scale cloud systems. It visualizes key metrics, such as call volumes and response times, so that operators can quickly identify performance issues.
Contribution
The paper presents CloudHeatMap, a novel heatmap-based visualization approach specifically designed for monitoring the health of large-scale cloud systems in near-real-time.
Findings
Improves detection of performance issues in Large-Scale Cloud Systems (LCS)
Enhances operational decision-making
Effectiveness demonstrated in a case study on the IBM Cloud Console
Abstract
Cloud computing is essential for modern enterprises, requiring robust tools to monitor and manage Large-Scale Cloud Systems (LCS). Traditional monitoring tools often miss critical insights due to the complexity and volume of LCS telemetry data. This paper presents CloudHeatMap, a novel heatmap-based visualization tool for near-real-time monitoring of LCS health. It offers intuitive visualizations of key metrics, such as call volumes and response times, enabling operators to quickly identify performance issues. A case study on the IBM Cloud Console demonstrates the tool's effectiveness in enhancing operational monitoring and decision-making.
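To make the idea of a metric heatmap concrete, the following is a minimal sketch, not the paper's implementation. It assumes telemetry arrives as hypothetical `(timestamp_seconds, component, response_ms)` records, groups them into fixed time buckets per component, and computes the two metrics the abstract names: call volume and mean response time. The function names, the record layout, the 60-second bucket size, and the text shading are all illustrative assumptions.

```python
from collections import defaultdict

# Characters from "no data" (space) to "hottest" (#); an arbitrary illustrative palette.
SHADES = " .:*#"


def build_heatmap(records, bucket_seconds=60):
    """Aggregate (timestamp, component, response_ms) records into heatmap cells.

    Returns a dict keyed by (component, bucket_index) holding the call count
    and the mean response time for that cell.
    """
    cells = defaultdict(lambda: [0, 0.0])  # [call count, total response ms]
    for timestamp, component, response_ms in records:
        bucket = int(timestamp // bucket_seconds)
        cell = cells[(component, bucket)]
        cell[0] += 1
        cell[1] += response_ms
    return {
        key: {"calls": count, "mean_ms": total / count}
        for key, (count, total) in cells.items()
    }


def render_rows(heatmap, metric="calls"):
    """Render one text row per component, one character per time bucket."""
    components = sorted({component for component, _ in heatmap})
    buckets = sorted({bucket for _, bucket in heatmap})
    # Scale every cell against the hottest cell; fall back to 1 to avoid dividing by zero.
    peak = max(cell[metric] for cell in heatmap.values()) or 1
    rows = {}
    for component in components:
        chars = []
        for bucket in range(buckets[0], buckets[-1] + 1):
            cell = heatmap.get((component, bucket))
            if cell is None:
                chars.append(SHADES[0])  # no calls observed in this bucket
            else:
                level = 1 + int((len(SHADES) - 2) * cell[metric] / peak)
                chars.append(SHADES[level])
        rows[component] = "".join(chars)
    return rows


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = [(0, "api", 100), (30, "api", 300), (70, "api", 50), (10, "db", 20)]
    for name, row in render_rows(build_heatmap(sample)).items():
        print(f"{name:>4} |{row}|")
```

Switching `metric` to `"mean_ms"` shades the same grid by response time instead of call volume, which is one simple way a single aggregation can feed several views.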
Taxonomy
Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · Scientific Computing and Data Management
