Generating Artificial Outliers in the Absence of Genuine Ones -- a Survey

Georg Steinbuss; Klemens Böhm

arXiv:2006.03646·cs.LG·May 7, 2021

TL;DR

This survey reviews and compares various methods for generating artificial outliers, highlighting their differences, experimental performance, and potential for guiding future research in outlier detection.

Contribution

It systematically categorizes and analyzes existing approaches for artificial outlier generation, providing a comprehensive framework and decision process for selecting suitable methods.

Findings

01 Generation quality varies widely across approaches

02 Existing methods cover diverse concepts but have room for improvement

03 Experimental results highlight differences in effectiveness depending on data sets

Abstract

By definition, outliers are rarely observed in reality, making them difficult to detect or analyse. Artificial outliers approximate such genuine outliers and can, for instance, help with the detection of genuine outliers or with benchmarking outlier-detection algorithms. The literature features different approaches to generate artificial outliers. However, a systematic comparison of these approaches remains absent. This survey reviews and compares these approaches. We start by clarifying the terminology in the field, which varies from publication to publication, and we propose a general problem formulation. We frame the field of artificial outliers by describing how generating outliers connects to other research fields such as experimental design and generative models. Along with offering a concise description, we group the approaches by their general concepts and how they make use of…
