Advancing ATLAS DCS Data Analysis with a Modern Data Platform
Luca Canali, Andrea Formica, Michelle Ann Solis

TL;DR
This paper introduces a scalable, modern data analysis framework for ATLAS DCS data, leveraging Apache Spark and CERN's SWAN platform to improve efficiency and troubleshooting capabilities.
Contribution
It presents a data pipeline that combines Apache Spark, CERN's Hadoop service, and the CERN SWAN platform for large-scale analysis of ATLAS DCS data, with the aim of faster troubleshooting and more efficient operations.
Findings
Effective in troubleshooting Data Acquisition (DAQ) links for the ATLAS New Small Wheel (NSW) detector
Seamless integration with Python notebooks
Improved data analysis efficiency for large datasets
Abstract
This paper presents a modern and scalable framework for analyzing Detector Control System (DCS) data from the ATLAS experiment at CERN. The DCS data are stored in an Oracle database via the WinCC OA system, and this storage is optimized for transactional operations, which makes large-scale analysis across long time periods and many devices difficult. To address these limitations, we developed a data pipeline using Apache Spark, CERN's Hadoop service, and the CERN SWAN platform. This framework integrates seamlessly with Python notebooks, providing an accessible and efficient environment for data analysis using industry-standard tools. The approach has proven effective in troubleshooting Data Acquisition (DAQ) links for the ATLAS New Small Wheel (NSW) detector, demonstrating the value of modern data platforms in enabling detector experts to quickly identify and resolve critical issues.
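The kind of analysis the abstract describes (summarizing readings across many devices and long time ranges) can be illustrated with a small, standard-library-only Python sketch. This is not the authors' pipeline: the device names, the sample values, and the `hourly_means` helper are all hypothetical, and the data is held in memory rather than read from Oracle or Hadoop. In a Spark notebook, the same per-device, per-hour grouping would typically be expressed as a `groupBy` with an aggregation over a DataFrame.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical DCS-style readings: (device name, timestamp, value).
# Names and values are invented for illustration only.
READINGS = [
    ("link_A", datetime(2024, 1, 1, 0, 5), 1.0),
    ("link_A", datetime(2024, 1, 1, 0, 35), 3.0),
    ("link_A", datetime(2024, 1, 1, 1, 10), 5.0),
    ("link_B", datetime(2024, 1, 1, 0, 20), 2.0),
]


def hourly_means(readings):
    """Average the readings within each (device, hour) bucket."""
    buckets = defaultdict(list)
    for device, ts, value in readings:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(device, hour)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


if __name__ == "__main__":
    for (device, hour), avg in sorted(hourly_means(READINGS).items()):
        print(device, hour.isoformat(), avg)
```

Grouping by device and time bucket is a full-scan, aggregate-heavy access pattern, which is the type of workload the abstract says a transaction-oriented database handles poorly and a distributed engine such as Spark is meant to take over.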
