High-dimensional outlier detection using random projections
P. Navarro-Esteban, J. A. Cuesta-Albertos

TL;DR
This paper introduces a new high-dimensional outlier detection method using random projections and univariate detection techniques, avoiding covariance matrix estimation in high dimensions.
Contribution
It proposes a novel random projections-based procedure for outlier detection in high-dimensional Gaussian data, eliminating the need for covariance matrix estimation.
Findings
Effective in high-dimensional settings
Performs well on simulated datasets
Validated on real datasets
Abstract
Multiple methods exist in the literature for detecting outliers in multivariate data, but most of them require estimating the covariance matrix. The higher the dimension, the more complex this estimation becomes, to the point of being impossible in high dimensions. To avoid estimating this matrix, we propose a novel random projections-based procedure to detect outliers in Gaussian multivariate data. It consists of projecting the data onto several one-dimensional subspaces, where an appropriate univariate outlier detection method, similar to Tukey's method but with a threshold depending on the initial dimension and the sample size, is applied. The required number of projections is determined using sequential analysis. Simulated and real datasets illustrate the performance of the proposed method.
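The general idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not give the threshold formula or the sequential stopping rule, so this sketch assumes a fixed number of projections (`n_projections`) and a constant Tukey-style fence multiplier (`fence`). The function name and both parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def random_projection_outliers(X, n_projections=100, fence=3.0, seed=0):
    """Flag rows of X that fall outside Tukey-style fences in at least
    one random one-dimensional projection.

    Assumptions (not taken from the paper): the number of projections is
    fixed rather than chosen by sequential analysis, and the fence
    multiplier is a constant rather than a function of the dimension and
    the sample size.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)

    # Draw random directions and normalize them to unit length.
    directions = rng.standard_normal((n_projections, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    # Project every observation onto every direction: shape (n, n_projections).
    projected = X @ directions.T

    # Univariate Tukey-style fences, computed separately per projection.
    q1, q3 = np.percentile(projected, [25, 75], axis=0)
    iqr = q3 - q1
    lower = q1 - fence * iqr
    upper = q3 + fence * iqr

    # An observation is flagged if it lies outside the fences
    # in any of the projections.
    outside = (projected < lower) | (projected > upper)
    return outside.any(axis=1)
```

Note that no covariance matrix is estimated at any point: each projection needs only two quartiles. Calling `random_projection_outliers(X)` on an array with one observation per row returns a boolean mask with one entry per observation.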
Taxonomy
Topics: Advanced Statistical Methods and Models · Advanced Statistical Process Monitoring · Anomaly Detection Techniques and Applications
