RECol: Reconstruction Error Columns for Outlier Detection
Jörn Hees, Dayananda Herurkar, Mario Meier

TL;DR
RECol is a novel pre-processing technique that creates reconstruction error columns to enhance outlier detection by capturing feature relationships often missed by traditional methods.
Contribution
It introduces a generic leave-one-out reconstruction error feature generation method that improves the effectiveness of existing outlier detection algorithms.
Findings
Adding RECol columns often improves the ROC-AUC and PR-AUC of common outlier detection methods, in many cases considerably.
The generated feature space generally supports a wide range of baseline approaches across benchmark datasets.
Reconstruction error columns capture feature relationships missed by standard approaches.
Abstract
Detecting outliers or anomalies is a common data analysis task. As a sub-field of unsupervised machine learning, it offers a large variety of approaches, but the vast majority treat the input features as independent and often fail to recognize even simple (linear) relationships in the input feature space. Hence, we introduce RECol, a generic data pre-processing approach to generate additional columns in a leave-one-out fashion: for each column, we try to predict its values based on the other columns, generating reconstruction error columns. We run experiments across a large variety of common baseline approaches and benchmark datasets with and without our RECol pre-processing method and show that the generated reconstruction error feature space generally seems to support common outlier detection methods and often considerably improves their ROC-AUC and PR-AUC values.
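The leave-one-out idea from the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `recol_features`, the use of an ordinary least-squares predictor, the absolute error as the reconstruction error, and the synthetic data are all assumptions made for illustration. The summary above does not say which predictors or error measures the paper uses.

```python
import numpy as np


def recol_features(X):
    """Append one reconstruction error column per input column.

    Hypothetical helper: each column is predicted from all other columns
    with a linear least-squares fit (an assumed choice of predictor), and
    the absolute prediction error becomes a new feature.
    """
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    errors = np.empty_like(X)
    for j in range(n_cols):
        # Leave column j out and add a constant column for the intercept.
        others = np.delete(X, j, axis=1)
        A = np.hstack([others, np.ones((n_rows, 1))])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        # Reconstruction error of column j for every row.
        errors[:, j] = np.abs(X[:, j] - A @ coef)
    # Original features first, reconstruction error columns after them.
    return np.hstack([X, errors])


# Synthetic data (an assumption): x1 is roughly 2 * x0.
rng = np.random.default_rng(0)
x0 = rng.uniform(0.0, 10.0, size=200)
x1 = 2.0 * x0 + rng.normal(0.0, 0.1, size=200)
X = np.column_stack([x0, x1])

# Row 0 breaks the linear relationship, although each of its values
# lies inside the normal range of its own column.
X[0] = [5.0, 2.0]

features = recol_features(X)
# Row with the largest reconstruction error in each error column.
suspect_rows = features[:, 2:].argmax(axis=0)
```

In this sketch the extended matrix `features` would be passed to any existing outlier detector in place of `X`. The planted row is unremarkable in each individual column, but it stands out in the reconstruction error columns, which is the kind of feature relationship the abstract says many detectors miss.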
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Fault Detection and Control Systems
