Anomaly Detection for Automated Data Quality Monitoring in the CMS Detector
Andrew Brinkerhoff, Chosila Sutantawibul, Robert White, Caio Daumann, Chad Freer, Indara Suarez, Samuel May, Vivan Nguyen, Jonathan Guiang, Bennett Marsh, Darin Acosta, Alex Aubuchon, Emanuela Barberis, Aaron Bundock, Evan Collins, Preston Epps, Johannes Erdmann

TL;DR
AutoDQM uses statistical techniques and unsupervised machine learning to detect anomalies in CMS detector data, flagging data affected by significant detector malfunction 4 to 6 times more often than good data.
Contribution
The paper introduces AutoDQM, an automated data quality monitoring system for particle detectors that combines statistical techniques with unsupervised machine learning to support rapid assessment of data quality.
Findings
AutoDQM flags "bad" data affected by significant detector malfunction at a rate 4 to 6 times higher than "good" data.
It successfully identifies data affected by detector malfunctions.
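The algorithms were tested on the full set of proton-proton collision data collected by CMS in 2022, which the authors present as evidence of its effectiveness as a general data quality monitoring tool.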
Abstract
Successful operation of large particle detectors like the Compact Muon Solenoid (CMS) at the CERN Large Hadron Collider requires rapid, in-depth assessment of data quality. We introduce the "AutoDQM" system for Automated Data Quality Monitoring using advanced statistical techniques and unsupervised machine learning. Anomaly detection algorithms based on the beta-binomial probability function, principal component analysis, and neural network autoencoder image evaluation are tested on the full set of proton-proton collision data collected by CMS in 2022. AutoDQM identifies anomalous "bad" data affected by significant detector malfunction at a rate 4–6 times higher than "good" data, demonstrating its effectiveness as a general data quality monitoring tool.
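To make the beta-binomial idea from the abstract concrete, here is a minimal sketch of how one histogram bin from a data run could be compared with the same bin in a reference run. It is an illustration under stated assumptions, not AutoDQM's actual implementation: the function names (`beta_binomial_log_pmf`, `bin_p_value`), the uniform prior on the bin fraction, and the p-value definition are all hypothetical choices made here. The assumption is that the reference bin count `k_ref` out of `n_ref` entries defines a Beta(k_ref + 1, n_ref - k_ref + 1) posterior for the bin fraction, so the count in the data run follows a beta-binomial distribution.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_log_pmf(k, n, alpha, beta):
    """Log-probability of k successes in n trials for BetaBinomial(n, alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


def bin_p_value(k_data, n_data, k_ref, n_ref):
    """Probability of a bin count at least as unlikely as the observed one.

    Assumption (not taken from the paper): a uniform prior on the bin
    fraction, updated with the reference histogram, predicts the data bin.
    """
    alpha = k_ref + 1
    beta = n_ref - k_ref + 1
    log_obs = beta_binomial_log_pmf(k_data, n_data, alpha, beta)
    total = 0.0
    for k in range(n_data + 1):
        log_p = beta_binomial_log_pmf(k, n_data, alpha, beta)
        # Sum every outcome that is no more probable than the observed count.
        if log_p <= log_obs + 1e-12:
            total += math.exp(log_p)
    return min(total, 1.0)


if __name__ == "__main__":
    # Reference run: 100 of 1000 entries fall in this bin (10%).
    print(bin_p_value(50, 500, 100, 1000))   # data bin also near 10%
    print(bin_p_value(150, 500, 100, 1000))  # data bin at 30%, strongly discrepant
```

A bin whose fraction matches the reference yields a p-value near 1, while a strongly discrepant bin yields a value near 0. A full monitoring tool would also need to combine the per-bin results into a per-histogram score, a step this sketch leaves out.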
