Deep learning for inferring cause of data anomalies
V. Azzolini, M. Borisyak, G. Cerminara, D. Derkach, G. Franzoni, F. De Guio, O. Koval, M. Pierini, A. Pol, F. Ratnikov, F. Siroky, A. Ustyuzhanin, and J-R. Vlimant

TL;DR
This paper presents a deep learning approach for identifying specific channels affected by anomalies in large-scale experimental data, reducing the need for detailed labels and aiding data quality monitoring.
Contribution
The paper introduces a novel deep learning model that detects anomalous channels without requiring ground truth labels for each channel, only a global anomaly indicator.
Findings
Successfully applied to CMS data from 2010, demonstrating effective anomaly decomposition.
The model distinguishes affected channels without detailed per-channel labels.
Proves useful for data quality management in large-scale experiments.
Abstract
Daily operation of a large-scale experiment is a resource-consuming task, particularly from the perspective of routine data quality monitoring. Typically, data comes from different sub-detectors, and the global quality of the data depends on the combined performance of each of them. In this paper, the problem of identifying the channels in which anomalies occurred is considered. We introduce a generic deep learning model and prove that, under reasonable assumptions, the model learns to identify the 'channels' that are affected by an anomaly. Such a model could be used to cross-check and assist data quality managers and to identify good channels in anomalous data samples. The main novelty of the method is that the model does not require ground truth labels for each channel; only a global flag is used. This effectively distinguishes the model from classical classification methods. Being applied to…
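The idea of learning per-channel behaviour from a global flag alone can be illustrated with a minimal sketch. It is not the paper's architecture: it assumes that each channel produces a probability of being good, that a sample is globally good only if every channel is good, and that channels can be treated as independent, so the global probability is the product of the per-channel ones. The function names and the scores below are hypothetical.

```python
import numpy as np


def global_good_probability(channel_good_probs):
    """Combine per-channel 'good' probabilities into one global probability.

    Assumption (illustrative only): the sample is globally good only if
    every channel is good, and channels are independent, so the global
    probability is the product over the last axis.
    """
    p = np.clip(np.asarray(channel_good_probs, dtype=float), 1e-12, 1.0)
    # Summing logs and exponentiating is a numerically stable product.
    return np.exp(np.log(p).sum(axis=-1))


def global_cross_entropy(channel_good_probs, global_is_good):
    """Binary cross-entropy against the global flag only.

    No per-channel labels appear here: the loss sees just the combined
    probability and the global good/anomalous flag.
    """
    q = np.clip(global_good_probability(channel_good_probs), 1e-12, 1 - 1e-12)
    y = np.asarray(global_is_good, dtype=float)
    return float(np.mean(-(y * np.log(q) + (1 - y) * np.log(1 - q))))


# Hypothetical per-channel scores: two samples, three channels each.
scores = np.array([
    [0.99, 0.98, 0.97],  # every channel looks healthy
    [0.99, 0.05, 0.97],  # the second channel looks anomalous
])

print(global_good_probability(scores))  # one global probability per sample
print(scores.argmin(axis=1))            # most suspicious channel per sample
```

Because the loss depends on the channels only through their product, a low global probability has to be explained by at least one low per-channel score, which is what lets the individual scores be read as channel-level diagnoses under these assumptions.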
Taxonomy
Topics: Particle Detector Development and Performance · Particle physics theoretical and experimental studies · Anomaly Detection Techniques and Applications
