Meta-Learning for Automated Selection of Anomaly Detectors for Semi-Supervised Datasets
David Schubert, Pritha Gupta, Marcel Wever

TL;DR
This paper proposes a meta-learning approach to automatically select the most suitable anomaly detector for semi-supervised datasets by predicting performance metrics using only normal data.
Contribution
It introduces a novel meta-learning framework that predicts detector performance metrics from normal data features, enabling automated detector selection in semi-supervised anomaly detection.
Findings
Meta-features like hypervolume and false positive rate are promising for performance prediction.
The approach facilitates automated anomaly detector selection without access to labeled anomalies.
Initial results show potential for improving semi-supervised anomaly detection workflows.
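The selection idea in the contribution and findings above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the summary does not say which meta-learner is used, so ordinary least squares stands in for it, and the function names, the two meta-features per detector (false positive rate and hypervolume), and all numbers are made-up assumptions chosen only to make the example run.

```python
import numpy as np

def fit_meta_model(meta_features, mcc_scores):
    """Fit a linear map from meta-features to MCC (stand-in meta-learner)."""
    feats = np.asarray(meta_features, dtype=float)
    X = np.column_stack([np.ones(len(feats)), feats])  # add intercept column
    coef, *_ = np.linalg.lstsq(X, np.asarray(mcc_scores, dtype=float), rcond=None)
    return coef

def predict_mcc(coef, meta_features):
    """Predict MCC for each row of meta-features."""
    feats = np.asarray(meta_features, dtype=float)
    X = np.column_stack([np.ones(len(feats)), feats])
    return X @ coef

def select_detector(coef, candidates):
    """Pick the candidate with the highest predicted MCC.

    candidates maps a detector name to meta-features computed on normal data only.
    """
    names = list(candidates)
    preds = predict_mcc(coef, [candidates[n] for n in names])
    return names[int(np.argmax(preds))], dict(zip(names, preds))

# Hypothetical meta-dataset: rows are [false_positive_rate, hypervolume]
# from earlier datasets where labeled anomalies allowed MCC to be measured.
train_features = [[0.00, 0.2], [0.10, 0.4], [0.20, 0.1], [0.05, 0.8]]
train_mcc = [0.90, 0.60, 0.55, 0.50]  # toy values, not results from the paper

coef = fit_meta_model(train_features, train_mcc)

# New dataset with normal data only: meta-features per candidate detector.
candidates = {"detector_a": [0.05, 0.3], "detector_b": [0.20, 0.6]}
best, predicted = select_detector(coef, candidates)
print(best, predicted)
```

The point of the design is that everything passed to `select_detector` is computable without labeled anomalies; labels are needed only for the earlier datasets used to fit the meta-model.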
Abstract
In anomaly detection, a prominent task is to induce a model that identifies anomalies but is learned solely from normal data. Generally, one is interested in finding an anomaly detector that correctly identifies anomalies, i.e., data points that do not belong to the normal class, without raising too many false alarms. Which anomaly detector is best suited depends on the dataset at hand, so the choice needs to be tailored to it. The quality of an anomaly detector may be assessed via confusion-based metrics such as the Matthews correlation coefficient (MCC). However, since only normal data is available during training in a semi-supervised setting, such metrics are not accessible. To facilitate automated machine learning for anomaly detectors, we propose to employ meta-learning to predict MCC scores based on metrics that can be computed with normal data only. First promising results can be obtained…
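To make the abstract's distinction concrete, the sketch below contrasts a metric that needs labeled anomalies (MCC, computed from the full confusion matrix) with one that needs only normal data (the false positive rate on held-out normal points). The function names are illustrative assumptions, and returning 0 for a zero denominator is a common convention rather than something stated in the source.

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Requires true positives and false negatives, i.e. labeled anomalies.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

def false_positive_rate(flags_on_normal):
    """Fraction of held-out normal points that a detector flags as anomalous.

    Computable with normal data only: every flag here is a false alarm.
    """
    return sum(flags_on_normal) / len(flags_on_normal)

perfect = mcc(tp=10, fp=0, tn=10, fn=0)
inverted = mcc(tp=0, fp=10, tn=0, fn=10)
chance = mcc(tp=5, fp=5, tn=5, fn=5)
fpr = false_positive_rate([0, 1, 0, 0])
print(perfect, inverted, chance, fpr)
```

In the semi-supervised setting only the second function can be evaluated at training time, which is why the paper proposes predicting MCC from such quantities.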
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Machine Learning and Data Classification · Network Security and Intrusion Detection
