BiDAl: Big Data Analyzer for Cluster Traces
Alkida Balliu, Dennis Olivetti, Ozalp Babaoglu, Moreno Marzolla, Alina Sîrbu

TL;DR
BiDAl is a modular framework that leverages Big Data technologies to facilitate the analysis of large-scale cluster logs, aiding in understanding data center behavior and improving management.
Contribution
It introduces a flexible, extensible log analysis framework supporting multiple storage and analysis tools, tailored for large data center trace analysis.
Findings
Successfully analyzed Google cluster traces
Built a simulation model reproducing observed behaviors
Demonstrated ease of integrating various data analysis tools
Abstract
Modern data centers that provide Internet-scale services are stadium-size structures housing tens of thousands of heterogeneous devices (server clusters, networking equipment, power and cooling infrastructures) that must operate continuously and reliably. As part of their operation, these devices produce large amounts of data in the form of event and error logs that are essential not only for identifying problems but also for improving data center efficiency and management. These activities employ data analytics and often exploit hidden statistical patterns and correlations among different factors present in the data. Uncovering these patterns and correlations is challenging due to the sheer volume of data to be analyzed. This paper presents BiDAl, a prototype "log-data analysis framework" that incorporates various Big Data technologies to simplify the analysis of data traces from large…
Taxonomy
Topics: Cloud Computing and Resource Management · Graph Theory and Algorithms · Advanced Data Storage Technologies
