Feature space reduction as data preprocessing for the anomaly detection

Simon Bilik; Karel Horak

arXiv:2203.06747 · cs.CV · May 1, 2023


TL;DR

This paper explores feature space reduction as a preprocessing step for anomaly detection with the One Class SVM. It compares three convolutional autoencoders as a shared first stage, followed by either PCA with t-SNE or reconstruction error metrics, and finds the reconstruction errors to be the more robust option.

Contribution

It introduces two pipelines built on a convolutional autoencoder first stage, one using PCA together with t-SNE and one using reconstruction errors, and shows that the reconstruction error metrics are the more robust of the two for anomaly detection.
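The feature-based pipeline (PCA together with t-SNE, then a One Class SVM) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the latent vectors are synthetic stand-ins for autoencoder bottleneck features, and the dimensions, `nu`, and `perplexity` values are arbitrary choices. Because scikit-learn's `TSNE` offers no transform for unseen samples, the sketch embeds training and test vectors jointly, which is also an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical latent vectors (assumption): in practice these would come
# from the bottleneck of a trained convolutional autoencoder.
latent_train = rng.normal(0.0, 1.0, size=(150, 32))      # normal samples only
latent_test = np.vstack([
    rng.normal(0.0, 1.0, size=(25, 32)),                  # normal test samples
    rng.normal(4.0, 1.0, size=(25, 32)),                  # shifted "anomalies"
])

# PCA first, then t-SNE down to 2-D. Both are fitted on the joint set
# because TSNE in scikit-learn cannot embed new points after fitting.
latent_all = np.vstack([latent_train, latent_test])
reduced = PCA(n_components=10).fit_transform(latent_all)
embedded = TSNE(n_components=2, perplexity=30.0, random_state=0).fit_transform(reduced)

emb_train = embedded[:150]
emb_test = embedded[150:]

# The One Class SVM learns the region occupied by normal training embeddings.
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(emb_train)
pred_test = detector.predict(emb_test)  # +1 = inlier, -1 = anomaly
```

The joint embedding is the main design caveat here: every new batch of test data requires recomputing the t-SNE embedding, and the layout can change between runs.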

Findings

1. Reconstruction error metrics are more robust for anomaly detection than the PCA with t-SNE pipeline.
2. The choice of convolutional autoencoder architecture has no significant effect on performance.
3. The approach shows potential on a real-world dataset.

Abstract

In this paper, we present two pipelines for reducing the feature space for anomaly detection with the One Class SVM. As the first stage of both pipelines, we compare the performance of three convolutional autoencoders. The first pipeline uses the PCA method together with t-SNE; the second uses a method based on reconstruction errors. Both methods show potential for anomaly detection, but the reconstruction error metrics prove to be more robust for this task. We show that the convolutional autoencoder architecture does not have a significant effect on this task, and we demonstrate the potential of our approach on a real-world dataset.
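The reconstruction-error pipeline described in the abstract can be sketched as below. This is an illustrative sketch under several assumptions, not the paper's code: a linear PCA projection stands in for the convolutional autoencoder so that the example runs without a deep-learning framework, the data are synthetic, and the two error metrics (per-sample MSE and MAE) are chosen for illustration since the abstract does not list the metrics used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic data (assumption): normal samples lie close to a 3-D subspace
# of a 64-D space; anomalies are drawn from the full space.
basis = rng.normal(size=(3, 64))

def make_normal(n):
    return rng.normal(size=(n, 3)) @ basis + 0.05 * rng.normal(size=(n, 64))

x_train = make_normal(300)
x_normal = make_normal(50)
x_anomalous = 2.0 * rng.normal(size=(50, 64))

# Stand-in for the autoencoder (assumption): PCA acts as a linear
# encoder/decoder pair trained on normal data only.
encoder = PCA(n_components=3).fit(x_train)

def reconstruction_errors(x):
    """Return per-sample [MSE, MAE] between x and its reconstruction."""
    recon = encoder.inverse_transform(encoder.transform(x))
    diff = x - recon
    return np.column_stack([np.mean(diff ** 2, axis=1),
                            np.mean(np.abs(diff), axis=1)])

# The One Class SVM is trained on the low-dimensional error features
# of normal samples; features are standardized first.
detector = make_pipeline(StandardScaler(),
                         OneClassSVM(kernel="rbf", gamma="scale", nu=0.05))
detector.fit(reconstruction_errors(x_train))

pred_normal = detector.predict(reconstruction_errors(x_normal))        # +1 = inlier
pred_anomalous = detector.predict(reconstruction_errors(x_anomalous))  # -1 = anomaly
```

The feature space seen by the One Class SVM shrinks from 64 input dimensions to two error values per sample, which is the sense in which reconstruction errors act as a feature space reduction.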


Code & Models

Repositories

boortel/ae-reconstruction-and-feature-based-ad (tf, Official)


Taxonomy

Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Artificial Immune Systems Applications

Methods: Principal Components Analysis · Support Vector Machine