Autoencoders for unsupervised anomaly detection in high energy physics
Thorben Finke, Michael Krämer, Alessandro Morandini, Alexander Mück, Ivan Oleksiyuk

TL;DR
This paper evaluates the effectiveness and limitations of autoencoders for unsupervised anomaly detection in high energy physics, specifically in jet image tagging, highlighting challenges in model independence and proposing improved metrics.
Contribution
It demonstrates the limitations of standard autoencoders for model-independent anomaly detection in jet images and suggests improved performance measures and training strategies.
Findings
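Autoencoders reproduce the positive literature results for tagging top jets in a QCD background, but fail at the inverted task (tagging QCD jets after training on top jets) because of the sparsity and specific structure of jet images.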
Standard autoencoders are not reliable as model-independent anomaly taggers.
Enhanced autoencoder training can improve feature learning for specific tagging tasks.
Abstract
Autoencoders are widely used in machine learning applications, in particular for anomaly detection. Hence, they have been introduced in high energy physics as a promising tool for model-independent new physics searches. We scrutinize the usage of autoencoders for unsupervised anomaly detection based on reconstruction loss to show their capabilities, but also their limitations. As a particle physics benchmark scenario, we study the tagging of top jet images in a background of QCD jet images. Although we reproduce the positive results from the literature, we show that the standard autoencoder setup cannot be considered as a model-independent anomaly tagger by inverting the task: due to the sparsity and the specific structure of the jet images, the autoencoder fails to tag QCD jets if it is trained on top jets even in a semi-supervised setup. Since the same autoencoder architecture can be…
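To make the reconstruction-loss idea concrete, here is a minimal sketch of anomaly tagging with an autoencoder. It is not the paper's setup: it assumes a linear autoencoder fitted via PCA as a stand-in for a trained neural network, and it uses synthetic toy vectors rather than jet images. All function names, dimensions, and the 95% threshold are illustrative assumptions.

```python
import numpy as np

def fit_linear_autoencoder(x_train, latent_dim):
    """Fit a linear autoencoder via PCA (an assumed stand-in for a neural autoencoder)."""
    mean = x_train.mean(axis=0)
    # Rows of vt are principal directions; the leading latent_dim rows
    # serve as both encoder and decoder weights.
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def reconstruction_loss(x, mean, components):
    """Per-sample mean squared reconstruction error, used as the anomaly score."""
    latent = (x - mean) @ components.T      # encode
    recon = latent @ components + mean      # decode
    return ((x - recon) ** 2).mean(axis=1)

rng = np.random.default_rng(0)

# Toy "background": 16-dimensional samples lying close to a 2-dimensional subspace.
basis = rng.normal(size=(2, 16))
background = rng.normal(size=(2000, 2)) @ basis + 0.05 * rng.normal(size=(2000, 16))
# Toy "signal": isotropic noise that the background-trained model cannot compress.
signal = rng.normal(size=(500, 16))

# Train on background only, then score held-out background and signal.
mean, components = fit_linear_autoencoder(background[:1000], latent_dim=2)
bkg_scores = reconstruction_loss(background[1000:], mean, components)
sig_scores = reconstruction_loss(signal, mean, components)

# Tag as anomalous everything above the 95th percentile of the background scores.
threshold = np.quantile(bkg_scores, 0.95)
signal_efficiency = (sig_scores > threshold).mean()
background_mistag = (bkg_scores > threshold).mean()
print(f"signal efficiency: {signal_efficiency:.2f}, background mistag: {background_mistag:.2f}")
```

In this toy the signal is deliberately harder to reconstruct than the background, so the tagger works. The abstract's point is that this is not guaranteed in general: when the roles of the two classes are swapped, the reconstruction loss need not separate them.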
