Deep Anomaly Detection under Labeling Budget Constraints
Aodong Li, Chen Qiu, Marius Kloft, Padhraic Smyth, Stephan Mandt, Maja Rudolph

TL;DR
This paper introduces a theoretical framework and a new semi-supervised learning approach for anomaly detection that chooses which data points to label under a budget constraint, achieving state-of-the-art results on image, tabular, and video data sets.
Contribution
It provides theoretical conditions for anomaly score generalization and proposes an optimal data labeling strategy combined with a semi-supervised learning framework.
Findings
Achieves state-of-the-art semi-supervised anomaly detection performance
Develops a theoretical understanding of anomaly score generalization
Proposes an optimal labeling strategy under budget constraints
Abstract
Selecting informative data points for expert feedback can significantly improve the performance of anomaly detection (AD) in various contexts, such as medical diagnostics or fraud detection. In this paper, we determine a set of theoretical conditions under which anomaly scores generalize from labeled queries to unlabeled data. Motivated by these results, we propose a data labeling strategy with optimal data coverage under labeling budget constraints. In addition, we propose a new learning framework for semi-supervised AD. Extensive experiments on image, tabular, and video data sets show that our approach results in state-of-the-art semi-supervised AD performance under labeling budget constraints.
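To make "data coverage under a labeling budget" concrete, here is a minimal sketch of one generic coverage-oriented selection rule: greedy farthest-first (k-center) selection. This is an illustrative assumption, not the strategy from the paper, since the abstract does not specify the selection rule. The function name `select_queries`, the `first` parameter, and the toy `embeddings` are all hypothetical.

```python
import math

def select_queries(points, budget, first=0):
    """Greedy farthest-first selection of up to `budget` point indices.

    Illustrative only: a generic coverage heuristic, not the paper's method.
    """
    if budget <= 0 or not points:
        return []
    budget = min(budget, len(points))
    chosen = [first]
    # Distance from every point to its nearest already-chosen point.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(chosen) < budget:
        # The worst-covered point is the next one to send for labeling.
        idx = max(range(len(points)), key=nearest.__getitem__)
        if nearest[idx] == 0.0:
            break  # every remaining point coincides with a chosen one
        chosen.append(idx)
        nearest = [min(d, math.dist(p, points[idx]))
                   for d, p in zip(nearest, points)]
    return chosen

# Toy 2-D "embeddings": two tight pairs and one isolated point.
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
queries = select_queries(embeddings, budget=3)
print(queries)
```

With a budget of three, the sketch picks one point from each well-separated region rather than spending two labels on near-duplicates, which is the intuition behind coverage-driven querying.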
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Data-Driven Disease Surveillance · Artificial Immune Systems Applications
