Understanding the Effect of Bias in Deep Anomaly Detection
Ziyu Ye, Yuxin Chen, Haitao Zheng

TL;DR
This paper investigates how bias in labeled anomaly data affects deep anomaly detection models, providing theoretical analysis, empirical validation, and insights into when bias can be beneficial or harmful.
Contribution
The paper establishes the first finite-sample rates for estimating the relative scoring bias of deep anomaly detectors, and presents an extensive empirical study of how a biased labeled anomaly set affects detection performance.
Findings
Biased anomaly sets can both help and hinder detection performance.
Finite-sample bounds for estimating the relative scoring bias are established and validated empirically.
The effect of bias varies across anomaly classes.
Abstract
Anomaly detection presents a unique challenge in machine learning, due to the scarcity of labeled anomaly data. Recent work attempts to mitigate such problems by augmenting training of deep anomaly detection models with additional labeled anomaly samples. However, the labeled data often does not align with the target distribution and introduces harmful bias to the trained model. In this paper, we aim to understand the effect of a biased anomaly set on anomaly detection. Concretely, we view anomaly detection as a supervised learning task where the objective is to optimize the recall at a given false positive rate. We formally study the relative scoring bias of an anomaly detector, defined as the difference in performance with respect to a baseline anomaly detector. We establish the first finite sample rates for estimating the relative scoring bias for deep anomaly detection, and…
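The two quantities named in the abstract, recall at a given false positive rate and the relative scoring bias against a baseline detector, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the choice of an empirical quantile for the threshold, the convention that higher scores mean "more anomalous", and the sign convention (candidate minus baseline) are all assumptions made for illustration.

```python
import numpy as np


def recall_at_fpr(normal_scores, anomaly_scores, target_fpr):
    """Empirical recall when the threshold is chosen on normal data.

    Assumption: higher score = more anomalous. The threshold is the
    empirical (1 - target_fpr) quantile of the normal scores, so about
    target_fpr of normal samples are flagged.
    """
    normal_scores = np.asarray(normal_scores, dtype=float)
    anomaly_scores = np.asarray(anomaly_scores, dtype=float)
    threshold = np.quantile(normal_scores, 1.0 - target_fpr)
    # Recall = fraction of true anomalies scored strictly above the threshold.
    return float(np.mean(anomaly_scores > threshold))


def relative_scoring_bias(cand_normal, cand_anomaly,
                          base_normal, base_anomaly, target_fpr):
    """Difference in recall at the same target FPR.

    Assumed sign convention (hypothetical): candidate minus baseline,
    so a positive value means the candidate detector recalls more.
    """
    candidate = recall_at_fpr(cand_normal, cand_anomaly, target_fpr)
    baseline = recall_at_fpr(base_normal, base_anomaly, target_fpr)
    return candidate - baseline


# Toy scores: both detectors score normal data identically, but the
# candidate ranks one more anomaly above the threshold.
normal = np.arange(100)
baseline_anomaly = [95, 92, 80, 50]
candidate_anomaly = [95, 92, 90, 50]
bias = relative_scoring_bias(normal, candidate_anomaly,
                             normal, baseline_anomaly, target_fpr=0.1)
```

In practice each detector would be thresholded on its own scores for held-out normal data, which is why the sketch takes a separate normal-score array per detector.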
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Imbalanced Data Classification Techniques
