Autoencoders for Anomaly Detection are Unreliable
Roel Bouman, Tom Heskes

TL;DR
This paper demonstrates that autoencoders, commonly used for anomaly detection, can reconstruct anomalies perfectly and therefore fail to flag them, which calls their reliability into question and poses risks in safety-critical applications.
Contribution
It provides a theoretical and empirical analysis showing autoencoders can reconstruct anomalies accurately, undermining their effectiveness for anomaly detection.
Findings
Linear autoencoders can perfectly reconstruct out-of-distribution data
Autoencoders may extrapolate undesirably, leading to false negatives
Reconstruction failure assumptions do not always hold in practice
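The first finding can be illustrated with a small numerical sketch. It is not the authors' experiment: it assumes synthetic 3-D data lying near a 2-D plane, and it uses a PCA projection as a stand-in for a linear autoencoder trained to its optimum. All names (`basis`, `components`, `reconstruction_error`, the scale 1000) are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "normal" data: 3-D points lying close to a 2-D plane.
latent = rng.normal(size=(500, 2))
basis = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, -0.5]])
X = latent @ basis + 0.01 * rng.normal(size=(500, 3))

# Stand-in for a trained linear autoencoder with a 2-unit bottleneck:
# project onto the top-2 principal directions of the training data.
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
components = vt[:2]  # shape (2, 3), rows are orthonormal


def reconstruction_error(points):
    """Euclidean reconstruction error of each row of `points`."""
    centered = points - mean
    codes = centered @ components.T   # encoder
    recon = codes @ components        # decoder
    return np.linalg.norm(centered - recon, axis=1)


normal_error = reconstruction_error(X).mean()

# An anomaly very far from the training data, but inside the learned subspace.
far_anomaly = mean + 1000.0 * components[0]
far_error = reconstruction_error(far_anomaly[None, :])[0]

# For contrast, a much closer anomaly that lies off the learned subspace.
off_direction = np.cross(components[0], components[1])  # unit normal to the plane
off_anomaly = mean + 5.0 * off_direction
off_error = reconstruction_error(off_anomaly[None, :])[0]

print(f"mean error on normal data:    {normal_error:.4f}")
print(f"error, far in-subspace point: {far_error:.2e}")
print(f"error, near off-subspace point: {off_error:.4f}")
```

In this sketch the point 1000 units away along a learned direction is reconstructed almost exactly, so a reconstruction-error threshold would not flag it, while the much closer off-subspace point is flagged. Distance from the training data and reconstruction error are therefore not interchangeable.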
Abstract
Autoencoders are frequently used for anomaly detection, both in the unsupervised and semi-supervised settings. They rely on the assumption that when trained using the reconstruction loss, they will be able to reconstruct normal data more accurately than anomalous data. Some recent works have posited that this assumption may not always hold, but little has been done to study the validity of the assumption in theory. In this work we show that this assumption indeed does not hold, and illustrate that anomalies, lying far away from normal data, can be perfectly reconstructed in practice. We revisit the theory of failure of linear autoencoders for anomaly detection by showing how they can perfectly reconstruct out of bounds, or extrapolate undesirably, and note how this can be dangerous in safety critical applications. We connect this to non-linear autoencoders through experiments on both…
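A short worked equation may make the linear case concrete. It is a sketch under stated assumptions: mean-centred data and a linear autoencoder at its PCA optimum, so that encoding followed by decoding is an orthogonal projection. The symbols $U$, $z$ and $c$ are notation introduced here, not taken from the paper.

```latex
\begin{align}
  \hat{x} &= U U^\top x,
    \qquad U \in \mathbb{R}^{d \times k},\quad U^\top U = I_k \\
  x = c\,U z \;\Longrightarrow\;
  \hat{x} &= U U^\top (c\,U z) = c\,U (U^\top U)\, z = c\,U z = x \\
  \lVert x - \hat{x} \rVert &= 0 \quad \text{for every scale } c \in \mathbb{R}
\end{align}
```

Under these assumptions any point in the span of $U$ has zero reconstruction error, however large $c$ is, which is why distance from the normal data alone does not guarantee a high anomaly score.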
Peer Reviews
Decision·ICLR 2025 Conference Withdrawn Submission
- The paper presents a novel and intriguing perspective by questioning the use of autoencoders for anomaly detection theoretically.
- Analysis is conducted through both theoretical exploration and practical examples using synthetic data and MNIST.
- Figure 1 results are well-illustrated and offer interesting findings.
- Although this work highlights the limitations of autoencoders for anomaly detection, it may not fully address practical cases where anomalies come from out-of-class data or shifts in distribution, which differ from the conditions presented. For instance, in Figure 2, some examples show failure cases in convolutional autoencoders, but in many regions, reconstruction errors appropriately increase as the distribution shift grows. Specifically, in Figure 2(a), the presence of low reconstruction er
+ Mathematical notations and flow are sound.
- Discussed in the summary.
* The paper discusses the under-appreciated problem of the autoencoder being capable of generating anomalies when applied to anomaly detection.
* The overall exposition of the paper is clear and easy to follow.
* The paper only reports the problem but not a solution. The contribution of the paper is questionable, as the unexpected reconstruction of anomalies by an autoencoder was mentioned and studied several times in previous works. According to line 427 of the manuscript, this work is not the first to report the reconstruction of anomalies.
* There are missing references that reported and discussed the anomaly reconstruction phenomenon.
* https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial
- This paper theoretically and visually demonstrates that autoencoders can reconstruct data that is not included in the training dataset.
- I am unclear about the novelty of this paper. Please refer to the Questions section.
Taxonomy
Topics: Anomaly Detection Techniques and Applications
