Generating Artificial Outliers in the Absence of Genuine Ones -- a Survey
Georg Steinbuss, Klemens Böhm

TL;DR
This survey reviews and compares various methods for generating artificial outliers, highlighting their differences, experimental performance, and potential for guiding future research in outlier detection.
Contribution
It systematically categorizes and analyzes existing approaches for artificial outlier generation, providing a comprehensive framework and decision process for selecting suitable methods.
Findings
Generation quality varies widely across approaches
Existing methods cover diverse concepts but have room for improvement
Experimental results highlight differences in effectiveness depending on data sets
Abstract
By definition, outliers are rarely observed in reality, making them difficult to detect or analyse. Artificial outliers approximate such genuine outliers and can, for instance, help with the detection of genuine outliers or with benchmarking outlier-detection algorithms. The literature features different approaches to generate artificial outliers. However, a systematic comparison of these approaches remains absent. This article surveys and compares these approaches. We start by clarifying the terminology in the field, which varies from publication to publication, and we propose a general problem formulation. We frame the field of artificial outliers by describing how generating outliers connects to other research fields such as experimental design and generative models. Along with offering a concise description, we group the approaches by their general concepts and how they make use of…
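To make the idea of artificial outliers concrete, here is a minimal sketch of one simple generation strategy: sampling points uniformly from a box slightly larger than the bounding box of the genuine data. This is an illustrative assumption, not a method attributed to the survey; the function name `uniform_artificial_outliers`, the `margin` parameter, and the toy data are all hypothetical.

```python
import random


def uniform_artificial_outliers(data, n_outliers, margin=0.1, seed=0):
    """Sample points uniformly from an inflated bounding box of `data`.

    Hypothetical helper for illustration only. `margin` widens each
    attribute's range by that fraction on both sides.
    """
    rng = random.Random(seed)
    dims = len(data[0])
    bounds = []
    for d in range(dims):
        lo = min(row[d] for row in data)
        hi = max(row[d] for row in data)
        pad = margin * (hi - lo)
        bounds.append((lo - pad, hi + pad))
    # One uniform draw per attribute, independently for each artificial point.
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_outliers)]


# Toy "genuine" data: 200 two-dimensional points (an assumption for the demo).
data_rng = random.Random(1)
genuine = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(5.0, 2.0)] for _ in range(200)]

artificial = uniform_artificial_outliers(genuine, n_outliers=50)

# Label genuine points 0 and artificial outliers 1, e.g. as input for a classifier.
labelled = [(x, 0) for x in genuine] + [(x, 1) for x in artificial]
```

A labelled set like this could be used to train a classifier as an outlier detector or to benchmark one. Note that uniformly sampled points may fall inside dense regions of the genuine data, so the labels are only an approximation.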
