Outlier Detection Using Vector Cosine Similarity by Adding a Dimension
Zhongyang Shen

TL;DR
This paper introduces a novel outlier detection technique using vector cosine similarity in a modified dataset with an added zero-value dimension, enabling effective identification of anomalies in multi-dimensional data.
Contribution
The paper presents a new outlier detection method based on cosine similarity with a dataset augmentation approach, and provides an optimized implementation called MDOD.
Findings
Effective outlier detection demonstrated in multi-dimensional data
Implementation available on PyPI for practical use
Method improves detection accuracy over traditional techniques
Abstract
We propose a new outlier detection method for multi-dimensional data. The method detects outliers based on vector cosine similarity, using a new dataset constructed by adding a dimension with zero values to the original data. When a point in the new dataset is selected as the measured point, an observation point is created to serve as the origin of the vectors; it is identical to the measured point except that its value in the new dimension is non-zero. Vectors are then formed from the observation point to the measured point and to each of the other points in the dataset. By comparing the cosine similarities of these vectors, abnormal data can be identified. An optimized implementation (MDOD) is available on PyPI: https://pypi.org/project/mdod/.
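The construction described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the MDOD package's API or the author's optimized implementation: the function name `cosine_outlier_scores`, the parameter `h` (the non-zero value given to the observation point in the added dimension), and the choice to aggregate by the mean cosine similarity are all hypothetical, since the abstract does not specify how the similarities are compared. With this geometry, the cosine between the vector to the measured point and the vector to another point at distance d in the original space is h / sqrt(d^2 + h^2), so nearby points give values close to 1 and distant points give values close to 0.

```python
import numpy as np

def cosine_outlier_scores(X, h=1.0):
    """Mean cosine similarity per point; lower suggests a more isolated point.

    Hypothetical sketch: `h` and the mean aggregation are assumptions,
    not taken from the paper or from the MDOD package.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Build the new dataset: append one extra dimension filled with zeros.
    Z = np.hstack([X, np.zeros((n, 1))])
    scores = np.empty(n)
    for i in range(n):
        # Observation point: same as the measured point, except the added
        # dimension holds the non-zero value h.
        obs = Z[i].copy()
        obs[-1] = h
        # Vector from the observation point to the measured point: (0, ..., 0, -h).
        ref = Z[i] - obs
        # Vectors from the observation point to every other point.
        others = np.delete(Z, i, axis=0) - obs
        # Each norm is at least |h| > 0, so the division is safe for h != 0.
        cos = (others @ ref) / (np.linalg.norm(others, axis=1) * np.linalg.norm(ref))
        scores[i] = cos.mean()
    return scores

# Four clustered points and one isolated point (illustrative data only).
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
scores = cosine_outlier_scores(X)
print(scores)
print("most isolated index:", int(np.argmin(scores)))
```

In this sketch the choice of `h` sets the distance scale: a point at distance `h` from the measured point contributes a cosine of about 0.71, so `h` would need to be chosen relative to the spread of the data.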
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models · Time Series Analysis and Forecasting
