Collection and harmonization of system logs and prototypal Analytics services with the Elastic (ELK) suite at the INFN-CNAF computing centre
Tommaso Diotalevi, Antonio Falabella, Barbara Martelli, Diego Michelotto, Lucia Morganti, Daniele Bonacorsi, Luca Giommi, Simone Rossi Tisbeni

TL;DR
This paper presents a system that collects, harmonizes, and analyzes heterogeneous system logs at the INFN-CNAF computing centre using the Elastic (ELK) suite, and explores a machine learning approach to predictive maintenance.
Contribution
It introduces a practical implementation of log collection and harmonization using ELK and investigates a novel machine learning-based predictive maintenance system.
Findings
Effective log collection and parsing from diverse sources.
Investigation of a machine learning based predictive maintenance system.
Enhanced monitoring capabilities at the INFN-CNAF center.
Abstract
The distributed Grid infrastructure for High Energy Physics experiments at the Large Hadron Collider (LHC) in Geneva comprises a set of computing centres spread all over the world, as part of the Worldwide LHC Computing Grid (WLCG). In Italy, the Tier-1 functionalities are served by the INFN-CNAF data center, which also provides computing and storage resources to more than twenty non-LHC experiments. As a result, a large volume of logs is collected each day from various sources, which are highly heterogeneous and difficult to harmonize. This contribution presents a working implementation of a system that collects, parses and displays the log information from CNAF data sources, together with an investigation of a Machine Learning based predictive maintenance system.
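As a rough illustration of what "harmonizing" heterogeneous logs can mean in practice, the sketch below maps two differently formatted log lines onto one common record. It is not the authors' implementation: both line formats, the field names, the `harmonize` function and the assumption that timestamps are in UTC are hypothetical, chosen only for this example. In an ELK deployment this step would typically be handled by Logstash filters rather than standalone Python.

```python
import re
from datetime import datetime, timezone

# Two hypothetical log formats (assumptions for illustration only).
# Format A: "2019-03-01T12:00:05Z node01 svc-a: request failed"
FORMAT_A = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z "
    r"(?P<host>\S+) (?P<service>[\w\-]+): (?P<message>.*)$"
)
# Format B: "[01/03/2019 12:00:05] svc-b@node02 | connection closed"
FORMAT_B = re.compile(
    r"^\[(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\] "
    r"(?P<service>[\w\-]+)@(?P<host>\S+) \| (?P<message>.*)$"
)

# Each pattern is paired with the strptime format of its timestamp.
PARSERS = [
    (FORMAT_A, "%Y-%m-%dT%H:%M:%S"),
    (FORMAT_B, "%d/%m/%Y %H:%M:%S"),
]


def harmonize(line):
    """Return a common-schema dict for a known format, or None otherwise."""
    for pattern, ts_format in PARSERS:
        match = pattern.match(line.strip())
        if match is None:
            continue
        # Assumption: source timestamps are already expressed in UTC.
        ts = datetime.strptime(match["ts"], ts_format).replace(tzinfo=timezone.utc)
        return {
            "@timestamp": ts.isoformat(),
            "host": match["host"],
            "service": match["service"],
            "message": match["message"],
        }
    return None


if __name__ == "__main__":
    for raw in (
        "2019-03-01T12:00:05Z node01 svc-a: request failed",
        "[01/03/2019 12:00:05] svc-b@node02 | connection closed",
        "a line in an unknown format",
    ):
        print(harmonize(raw))
```

Both recognised lines end up with the same keys and a timestamp in a single ISO 8601 form, which is what makes them searchable and plottable together once indexed. Lines that match no known format return `None` so they can be routed elsewhere instead of being silently dropped.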
