New Methods and Datasets for Group Anomaly Detection From Fundamental Physics
Gregor Kasieczka, Benjamin Nachman, David Shih

TL;DR
This paper introduces a benchmark dataset for group anomaly detection motivated by fundamental physics, specifically the search for new particles and forces after the discovery of the Higgs boson, and evaluates existing methods on it.
Contribution
It presents a realistic synthetic dataset (LHCO2020) for group anomaly detection and compares several existing techniques on this benchmark.
Findings
Existing techniques show varying performance on LHCO2020
The dataset provides a new platform for developing group anomaly detection methods
Unsupervised group anomaly detection remains a challenging problem
Abstract
The identification of anomalous overdensities in data - group or collective anomaly detection - is a rich problem with a large number of real world applications. However, it has received relatively little attention in the broader ML community, as compared to point anomalies or other types of single instance outliers. One reason for this is the lack of powerful benchmark datasets. In this paper, we first explain how, after the Nobel-prize winning discovery of the Higgs boson, unsupervised group anomaly detection has become a new frontier of fundamental physics (where the motivation is to find new particles and forces). Then we propose a realistic synthetic benchmark dataset (LHCO2020) for the development of group anomaly detection algorithms. Finally, we compare several existing statistically-sound techniques for unsupervised group anomaly detection, and demonstrate their performance on…
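To make the idea of an "anomalous overdensity" concrete, here is a minimal toy sketch in Python. It is not the paper's method and does not use LHCO2020: the uniform background, the Gaussian signal, the window edges, and the helper names `count_in` and `window_excess` are all assumptions chosen for illustration. It shows a small group of events that is unremarkable event by event but stands out as a local excess when a window count is compared with an estimate from neighbouring sidebands.

```python
import math
import random


def count_in(values, lo, hi):
    """Number of values in the half-open interval [lo, hi)."""
    return sum(1 for v in values if lo <= v < hi)


def window_excess(values, lo, hi):
    """Compare the count in [lo, hi) with a background estimate taken
    from the two neighbouring sidebands of the same width.

    Returns (observed, expected, z), where z is a naive
    (observed - expected) / sqrt(expected) score. It ignores the
    uncertainty on the sideband estimate, so it is only a rough guide.
    """
    width = hi - lo
    observed = count_in(values, lo, hi)
    left = count_in(values, lo - width, lo)
    right = count_in(values, hi, hi + width)
    expected = 0.5 * (left + right)
    z = (observed - expected) / math.sqrt(expected)
    return observed, expected, z


# Toy data (assumed for illustration): a flat background plus a small,
# localized group of "signal" events near x = 5.
rng = random.Random(0)
background = [rng.uniform(0.0, 10.0) for _ in range(20000)]
signal = [rng.gauss(5.0, 0.1) for _ in range(300)]
events = background + signal

# Window centred on the injected group, and a control window elsewhere.
obs_sig, exp_sig, z_sig = window_excess(events, 4.7, 5.3)
obs_ctl, exp_ctl, z_ctl = window_excess(events, 1.7, 2.3)

print(f"signal window:  observed={obs_sig}, expected={exp_sig:.1f}, z={z_sig:.2f}")
print(f"control window: observed={obs_ctl}, expected={exp_ctl:.1f}, z={z_ctl:.2f}")
```

Every injected event lies well inside the range covered by the background, so a point-anomaly detector looking at single events has nothing to flag; only the collective excess in the window is unusual. Realistic searches work in many dimensions and do not know the window location in advance, which is part of what makes the problem hard.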
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Computational Physics and Python Applications · Particle physics theoretical and experimental studies
