Robust Subspace Outlier Detection in High Dimensional Space

Zhana Bao

arXiv:1405.0869 · cs.AI · May 6, 2014 · 2 cites



TL;DR

This paper introduces a robust subspace method for detecting outliers hidden inside normal clusters in high-dimensional data. It combines a local density ratio in a first projection with a comparison of neighbor positions in a second projection, and reports high precision.

Contribution

It proposes a novel two-step subspace outlier detection approach that effectively identifies hidden outliers in very high dimensional spaces.

Findings

01 High precision in extremely high-dimensional spaces

02 Effective detection of hidden outliers within normal clusters

03 Works well across a range of dimensions from 10 to 10,000

Abstract

Rare data in a large-scale database are called outliers, and they can reveal significant information about the real world. Subspace-based outlier detection is regarded as a feasible approach in very high dimensional space. However, the outliers found in subspaces are only part of the true outliers in the high dimensional space. Outliers hidden among normally clustered points are sometimes missed in the projected subspace. In this paper, we propose a robust subspace method for detecting such inner outliers in a given dataset. It uses two dimensional projections: the first detects outliers in subspaces using a local density ratio, and the second finds outliers by comparing neighbors' positions. Each point's weight is calculated by summing all related values obtained in the two projection steps, and then the points scoring…
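The two-step scoring described in the abstract can be illustrated with a small sketch. The abstract is truncated and gives no formulas, so everything concrete here is an assumption rather than the paper's method: the subspaces are fixed-size chunks of dimensions, the "local density ratio" is a point's mean k-nearest-neighbor distance divided by that of its neighbors, and the "neighbor position comparison" is the point's distance to the centroid of its neighbors relative to the neighbors' own spread. The names `outlier_scores`, `density_ratio`, `position_score`, `subspace_size`, and `k` are hypothetical.

```python
import math
import random


def knn(points, i, dims, k):
    """k nearest neighbours of point i as (distance, index) pairs,
    with Euclidean distance measured only over the dimensions in `dims`."""
    dists = []
    for j, q in enumerate(points):
        if j != i:
            d = math.sqrt(sum((points[i][a] - q[a]) ** 2 for a in dims))
            dists.append((d, j))
    dists.sort()
    return dists[:k]


def mean_knn_dist(points, i, dims, k):
    """Mean distance to the k nearest neighbours (an inverse local density)."""
    return sum(d for d, _ in knn(points, i, dims, k)) / k


def density_ratio(points, i, dims, k):
    """Step 1 (assumed form): own sparsity over the neighbours' mean sparsity.
    Values well above 1 mean the point sits in a sparser region than its neighbours."""
    own = mean_knn_dist(points, i, dims, k)
    nbrs = [j for _, j in knn(points, i, dims, k)]
    nbr_mean = sum(mean_knn_dist(points, j, dims, k) for j in nbrs) / k
    return own / (nbr_mean + 1e-12)


def position_score(points, i, dims, k):
    """Step 2 (assumed form): distance from the point to its neighbours' centroid,
    relative to the neighbours' mean distance from that same centroid."""
    nbrs = [j for _, j in knn(points, i, dims, k)]
    centroid = {a: sum(points[j][a] for j in nbrs) / k for a in dims}

    def dist_to_centroid(p):
        return math.sqrt(sum((p[a] - centroid[a]) ** 2 for a in dims))

    spread = sum(dist_to_centroid(points[j]) for j in nbrs) / k
    return dist_to_centroid(points[i]) / (spread + 1e-12)


def chunk(dims, size):
    """Split a list of dimension indices into consecutive groups of `size`."""
    return [dims[s:s + size] for s in range(0, len(dims), size)]


def outlier_scores(points, subspace_size=3, k=5, seed=0):
    """Sum the step-1 and step-2 values over two different groupings of dimensions."""
    n_dims = len(points[0])
    first = chunk(list(range(n_dims)), subspace_size)   # first projection
    shuffled = list(range(n_dims))
    random.Random(seed).shuffle(shuffled)
    second = chunk(shuffled, subspace_size)             # second projection
    scores = []
    for i in range(len(points)):
        s = sum(density_ratio(points, i, dims, k) for dims in first)
        s += sum(position_score(points, i, dims, k) for dims in second)
        scores.append(s)
    return scores


# Synthetic demo data (not from the paper): 60 Gaussian points in 12 dimensions,
# with point 0 moved far away from the cluster in every dimension.
rng = random.Random(1)
points = [[rng.gauss(0.0, 1.0) for _ in range(12)] for _ in range(60)]
points[0] = [8.0] * 12
scores = outlier_scores(points)
top = max(range(len(scores)), key=scores.__getitem__)
print("highest-scoring point:", top)
```

The planted point in this demo is an obvious outlier, so it only checks that the summed score ranks it first. It does not reproduce the hidden-outlier setting or the precision figures the paper reports. The brute-force neighbor search is quadratic in the number of points and is meant for small illustrations only.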


Taxonomy

Topics: Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models · Artificial Immune Systems Applications