Autoencoders for Anomaly Detection are Unreliable

Roel Bouman; Tom Heskes

arXiv:2501.13864·cs.LG·January 24, 2025·2 cites


Open Access·4 Reviews

TL;DR

This paper demonstrates that autoencoders, commonly used for anomaly detection, can fail by perfectly reconstructing anomalies, challenging their reliability and highlighting potential safety risks.

Contribution

It provides a theoretical and empirical analysis showing autoencoders can reconstruct anomalies accurately, undermining their effectiveness for anomaly detection.

Findings

01. Linear autoencoders can perfectly reconstruct out-of-distribution data

02. Autoencoders may extrapolate undesirably, leading to false negatives

03. Reconstruction failure assumptions do not always hold in practice

Abstract

Autoencoders are frequently used for anomaly detection, both in the unsupervised and semi-supervised settings. They rely on the assumption that when trained using the reconstruction loss, they will be able to reconstruct normal data more accurately than anomalous data. Some recent works have posited that this assumption may not always hold, but little has been done to study the validity of the assumption in theory. In this work we show that this assumption indeed does not hold, and illustrate that anomalies, lying far away from normal data, can be perfectly reconstructed in practice. We revisit the theory of failure of linear autoencoders for anomaly detection by showing how they can perfectly reconstruct out of bounds, or extrapolate undesirably, and note how this can be dangerous in safety critical applications. We connect this to non-linear autoencoders through experiments on both…
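The linear case described in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes the standard result that an optimal linear autoencoder spans the same subspace as PCA, and uses a hypothetical 2-D toy dataset with a 1-D bottleneck. The helper names (`fit_linear_autoencoder`, `reconstruction_error`) are invented for illustration. A point placed very far from the training data, but on the learned subspace, is reconstructed almost exactly, while a much closer point off the subspace is not.

```python
import math
import random


def fit_linear_autoencoder(points):
    """Fit a 1-D linear autoencoder to 2-D points via PCA (closed form).

    Assumption: the optimal linear autoencoder spans the leading
    principal subspace, so PCA is used in place of gradient training.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def reconstruction_error(point, mean, direction):
    """Encode to one latent value, decode, and return the Euclidean error."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    z = dx * direction[0] + dy * direction[1]  # encoder: project onto direction
    rx = mean[0] + z * direction[0]            # decoder: map back to 2-D
    ry = mean[1] + z * direction[1]
    return math.hypot(point[0] - rx, point[1] - ry)


# Hypothetical "normal" data: points scattered around the line y = 2x.
rng = random.Random(0)
normal = []
for _ in range(500):
    t = rng.gauss(0.0, 1.0)
    normal.append((t, 2.0 * t + rng.gauss(0.0, 0.1)))

mean, direction = fit_linear_autoencoder(normal)

# Anomaly 1: 1000 units from the data mean, but on the learned subspace.
far_anomaly = (mean[0] + 1000.0 * direction[0], mean[1] + 1000.0 * direction[1])
# Anomaly 2: only 3 units from the data mean, orthogonal to the subspace.
off_anomaly = (mean[0] - 3.0 * direction[1], mean[1] + 3.0 * direction[0])

typical_err = sum(reconstruction_error(p, mean, direction) for p in normal) / len(normal)
err_far = reconstruction_error(far_anomaly, mean, direction)
err_off = reconstruction_error(off_anomaly, mean, direction)

print("mean error on normal data:", typical_err)
print("error on far, on-subspace anomaly:", err_far)
print("error on near, off-subspace anomaly:", err_off)
```

Under these assumptions, a detector that thresholds on reconstruction error would score the far anomaly as more normal than the training data itself, which is the false-negative failure mode the paper warns about.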

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01·Rating 6·Confidence 3

Strengths

- The paper presents a novel and intriguing perspective by questioning the use of autoencoders for anomaly detection theoretically.
- Analysis is conducted through both theoretical exploration and practical examples using synthetic data and MNIST.
- Figure 1 results are well-illustrated and offer interesting findings.

Weaknesses

- Although this work highlights the limitations of autoencoders for anomaly detection, it may not fully address practical cases where anomalies come from out-of-class data or shifts in distribution, which differ from the conditions presented. For instance, in Figure 2, some examples show failure cases in convolutional autoencoders, but in many regions, reconstruction errors appropriately increase as the distribution shift grows. Specifically, in Figure 2(a), the presence of low reconstruction er

Reviewer 02·Rating 3·Confidence 4

Strengths

+ Mathematical notations and flow are sound.

Weaknesses

- Discussed in the summary.

Reviewer 03·Rating 3·Confidence 5

Strengths

* The paper discusses the under-appreciated problem of the autoencoder being capable of generating anomalies when applied to anomaly detection.
* The overall exposition of the paper is clear and easy to follow.

Weaknesses

* The paper only reports the problem but not a solution. The contribution of the paper is questionable, as the unexpected reconstruction of anomalies by an autoencoder was mentioned and studied several times in previous works. According to line 427 of the manuscript, this work is not the first to report the reconstruction of anomalies.
* There are missing references that reported and discussed the anomaly reconstruction phenomenon.
  * https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial

Reviewer 04·Rating 6·Confidence 3

Strengths

- This paper theoretically and visually demonstrates that autoencoders can reconstruct data that is not included in the training dataset.

Weaknesses

- I am unclear about the novelty of this paper. Please refer to the Questions section.

Taxonomy

Topics·Anomaly Detection Techniques and Applications