Robust Statistical Scaling of Outlier Scores: Improving the Quality of Outlier Probabilities for Outliers (Extended Version)
Philipp Röchner, Henrique O. Marques, Ricardo J. G. B. Campello, Arthur Zimek, Franz Rothlauf

TL;DR
This paper introduces robust statistical scaling, which transforms outlier scores into outlier probabilities and improves the quality of those probabilities for outliers. This matters most in applications where missed outliers are costly, such as healthcare and finance.
Contribution
It proposes a robust scaling method that improves outlier probability estimates for outliers, addressing limitations of existing statistical scaling techniques.
Findings
Improved outlier probability accuracy for real-world datasets
Robust scaling outperforms traditional methods in detecting outliers
Enhances interpretability and comparability of outlier scores
Abstract
Outlier detection algorithms typically assign an outlier score to each observation in a dataset, indicating the degree to which an observation is an outlier. However, these scores are often not comparable across algorithms and can be difficult for humans to interpret. Statistical scaling addresses this problem by transforming outlier scores into outlier probabilities without using ground-truth labels, thereby improving interpretability and comparability across algorithms. However, the quality of this transformation can be different for outliers and inliers. Missing outliers in scenarios where they are of particular interest - such as healthcare, finance, or engineering - can be costly or dangerous. Thus, ensuring good probabilities for outliers is essential. This paper argues that statistical scaling, as commonly used in the literature, does not produce equally good probabilities for…
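
To make the idea of label-free statistical scaling concrete, here is a minimal sketch. It assumes one common form of Gaussian scaling, in which a score is standardized with the sample mean and standard deviation and mapped through the error function. The second function is only an illustrative robust variant that swaps in the median and the median absolute deviation (MAD). It is an assumption made for this example and not necessarily the method proposed in the paper, whose details are not given in the abstract above. The function names and the toy scores are hypothetical.

```python
import math
import statistics


def gaussian_scaling(scores):
    """Map raw outlier scores to [0, 1] using the mean and standard deviation.

    Assumes higher scores mean "more outlying". No labels are used.
    """
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        # All scores identical: no evidence that any point is an outlier.
        return [0.0 for _ in scores]
    return [
        max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2))))
        for s in scores
    ]


def robust_gaussian_scaling(scores):
    """Illustrative robust variant (an assumption, not the paper's definition).

    Location is the median; scale is the MAD multiplied by 1.4826, the usual
    consistency factor for normally distributed data.
    """
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    scale = 1.4826 * mad
    if scale == 0:
        return [0.0 for _ in scores]
    return [
        max(0.0, math.erf((s - med) / (scale * math.sqrt(2))))
        for s in scores
    ]


# Hypothetical toy scores: seven similar values and one clear outlier (last).
scores = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 6.0]
classic = gaussian_scaling(scores)
robust = robust_gaussian_scaling(scores)
print("classic:", [round(p, 3) for p in classic])
print("robust: ", [round(p, 3) for p in robust])
```

In this toy example the single large score inflates both the mean and the standard deviation, which pulls its own probability down under the mean-based version. The median and MAD are much less affected by that one value, so the outlier receives a higher probability under the robust variant.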
Taxonomy
Topics: Advanced Statistical Methods and Models
