Theory of high-dimensional outliers
Hyo Young Choi, J. S. Marron

TL;DR
This paper introduces a new notion of high-dimensional outliers, describes their geometric behavior under several asymptotic regimes, and studies PCA subspace consistency when the data contain a limited number of outliers.
Contribution
It proposes a novel notion of high-dimensional outliers, explores their geometric transition phenomena, and examines PCA subspace consistency under limited outlier contamination.
Findings
High-dimensional outliers exhibit a transition phenomenon: they move from lying near the surface of a high-dimensional sphere to lying far from it.
The proposed notion of outliers covers various outlier types and is analyzed under several asymptotic regimes.
PCA subspace consistency is studied for data containing a limited number of outliers.
Abstract
This study concerns the issue of high dimensional outliers which are challenging to distinguish from inliers due to the special structure of high dimensional space. We introduce a new notion of high dimensional outliers that embraces various types and provides deep insights into understanding the behavior of these outliers based on several asymptotic regimes. Our study of geometrical properties of high dimensional outliers reveals an interesting transition phenomenon of outliers from near the surface of a high dimensional sphere to being distant from the sphere. Also, we study the PCA subspace consistency when data contain a limited number of outliers.
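The sphere geometry described in the abstract can be illustrated numerically. The following is a minimal sketch and not the paper's model: it assumes standard Gaussian inliers in d dimensions, whose norms are known to concentrate near sqrt(d), and a single hypothetical outlier obtained by shifting one Gaussian point along the first coordinate by d^(alpha/2). The function name `scaled_radii` and the exponent `alpha` are illustrative assumptions introduced here.

```python
import numpy as np


def scaled_radii(d, alpha, n_inliers=50, seed=0):
    """Return (inlier radii, outlier radius), each divided by sqrt(d).

    Assumed toy setup (not taken from the paper):
    - inliers are i.i.d. standard Gaussian vectors in d dimensions;
    - the outlier is a standard Gaussian vector shifted along the
      first coordinate by d ** (alpha / 2).
    """
    rng = np.random.default_rng(seed)

    # Inliers: their norms concentrate around sqrt(d).
    inliers = rng.standard_normal((n_inliers, d))

    # Hypothetical outlier: Gaussian noise plus a shift of size d^(alpha/2).
    shift = np.zeros(d)
    shift[0] = d ** (alpha / 2.0)
    outlier = rng.standard_normal(d) + shift

    inlier_radii = np.linalg.norm(inliers, axis=1) / np.sqrt(d)
    outlier_radius = np.linalg.norm(outlier) / np.sqrt(d)
    return inlier_radii, outlier_radius


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 1.5):
        for d in (100, 10_000):
            inlier_radii, outlier_radius = scaled_radii(d, alpha)
            print(
                f"alpha={alpha}, d={d}: "
                f"inlier radii in [{inlier_radii.min():.3f}, {inlier_radii.max():.3f}], "
                f"outlier radius = {outlier_radius:.3f}"
            )
```

Under these assumptions the squared scaled radius of the outlier is approximately 1 + d^(alpha - 1). For alpha below 1 the outlier therefore sits near the same sphere as the inliers and is hard to separate by distance alone, while for alpha above 1 it moves away from the sphere as d increases.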
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models · Financial Risk and Volatility Modeling
