Null Hypothesis Test for Anomaly Detection
Jernej F. Kamenik, Manuel Szewc

TL;DR
This paper introduces a hypothesis testing method for anomaly detection that assesses statistical independence between dataset regions, improving robustness and avoiding fixed thresholds, demonstrated on LHC Olympics data.
Contribution
It extends classification without labels by incorporating a hypothesis test for background-only hypothesis exclusion, leveraging mutual information and decorrelation techniques.
Findings
The method effectively detects anomalies across various signal fractions.
It maintains high performance even with realistic feature correlations.
Mutual information is a suitable test statistic for independence.
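For reference, the standard definition of mutual information is sketched below. The symbols are assumptions introduced here for illustration and are not taken from the paper: s denotes the anomaly score and r the dataset region label.

```latex
I(S;R) \;=\; \sum_{r} \int \mathrm{d}s \; p(s, r)\,
\log \frac{p(s, r)}{p(s)\, p(r)} \;\ge\; 0,
\qquad
I(S;R) = 0 \;\Longleftrightarrow\; p(s, r) = p(s)\, p(r).
```

Because the quantity vanishes exactly when score and region are statistically independent, a significantly non-zero estimate is evidence against independence.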
Abstract
We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions. The method relies on the assumption of conditional independence of anomaly score features and dataset regions, which can be ensured using existing decorrelation techniques. As a benchmark example, we consider the LHC Olympics dataset, where we show that mutual information represents a suitable test for statistical independence and that our method exhibits excellent and robust performance at different signal fractions, even in the presence of realistic feature correlations.
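The idea of testing independence between an anomaly score and the dataset region can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the binned plug-in mutual information estimator, the permutation-based null distribution, the function names `binned_mutual_information` and `permutation_p_value`, and the toy Gaussian data. The summary above does not specify which estimator or null-distribution procedure the paper uses.

```python
import numpy as np

def binned_mutual_information(scores, regions, n_bins=10):
    """Plug-in estimate (in nats) of I(score; region) for binary region labels."""
    scores = np.asarray(scores, dtype=float)
    regions = np.asarray(regions, dtype=int)  # assumed to contain only 0 and 1
    # Quantile edges give roughly equal event counts per score bin.
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, scores, side="right") - 1, 0, n_bins - 1)
    # Joint histogram of (score bin, region), normalised to a probability table.
    joint = np.zeros((n_bins, 2))
    np.add.at(joint, (bins, regions), 1.0)
    joint /= joint.sum()
    p_bin = joint.sum(axis=1, keepdims=True)  # marginal over regions
    p_reg = joint.sum(axis=0, keepdims=True)  # marginal over score bins
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (p_bin @ p_reg)[mask])))

def permutation_p_value(scores, regions, n_perm=200, n_bins=10, seed=0):
    """Return (observed MI, p-value) under a label-permutation null (an assumed choice)."""
    rng = np.random.default_rng(seed)
    observed = binned_mutual_information(scores, regions, n_bins)
    # Shuffling region labels enforces independence while keeping both marginals fixed.
    null = np.array([
        binned_mutual_information(scores, rng.permutation(regions), n_bins)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)

# Toy data (hypothetical): region labels and a score that is independent of them.
rng = np.random.default_rng(1)
n_events = 4000
regions = rng.integers(0, 2, size=n_events)
background_scores = rng.normal(size=n_events)

# Inject a signal-like component: 20% of region-1 events get a shifted score.
is_signal = (regions == 1) & (rng.random(n_events) < 0.2)
signal_scores = background_scores + 3.0 * is_signal

mi_bkg, p_bkg = permutation_p_value(background_scores, regions)
mi_sig, p_sig = permutation_p_value(signal_scores, regions)
print(f"background only: MI = {mi_bkg:.4f}, p = {p_bkg:.3f}")
print(f"with signal:     MI = {mi_sig:.4f}, p = {p_sig:.3f}")
```

A small p-value in the signal-injected case would disfavour the background-only (independence) hypothesis without placing any cut on the score. The smallest p-value this sketch can report is 1 / (1 + n_perm), so the number of permutations bounds the attainable significance.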
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
Methods: Test
