TL;DR
This paper systematically evaluates deep learning methods for multivariate time series anomaly detection, introduces a new composite F-score metric, and finds that simple models with dynamic scoring outperform complex state-of-the-art techniques.
Contribution
It compares a grid of 10 models and 4 scoring functions against state-of-the-art methods, introduces a new evaluation metric (the composite F-score), and identifies a simple univariate auto-encoder with dynamic Gaussian scoring as the best approach.
Findings
Dynamic scoring functions outperform static ones.
The composite F-score ($Fc_1$) effectively evaluates event detection.
A simple univariate auto-encoder with dynamic Gaussian scoring is highly effective.
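To make the static versus dynamic distinction concrete, here is a minimal sketch of a dynamic Gaussian scoring function. It is an illustration under stated assumptions, not the paper's implementation: it assumes the per-time-step reconstruction errors are already available, that "dynamic" means refitting a Gaussian to a sliding window of recent errors, and that the score is the negative log of the upper-tail probability. The function name and the `window` and `min_history` parameters are hypothetical.

```python
import math
import numpy as np


def dynamic_gaussian_score(errors, window=100, min_history=10, eps=1e-8):
    """Score each error against a Gaussian fit to the preceding `window` errors.

    Assumption: a static score would use one mean/std for the whole series;
    here both are re-estimated at every step from recent history only.
    """
    errors = np.asarray(errors, dtype=float)
    scores = np.zeros_like(errors)
    for t in range(min_history, len(errors)):
        past = errors[max(0, t - window):t]
        mu = past.mean()
        sigma = past.std() + eps  # eps avoids division by zero
        z = (errors[t] - mu) / sigma
        # Upper-tail probability P(X > e_t) under N(mu, sigma^2).
        tail = 0.5 * math.erfc(z / math.sqrt(2.0))
        # Clamp so the log stays finite for extreme z values.
        scores[t] = -math.log(max(tail, 1e-300))
    return scores


# Synthetic example: noisy errors with one injected spike at index 200.
rng = np.random.default_rng(0)
errors = np.abs(rng.normal(0.0, 1.0, size=300))
errors[200] += 10.0
scores = dynamic_gaussian_score(errors)
spike_index = int(np.argmax(scores))
```

Because the window slides, the same error magnitude can score high in a quiet stretch and low in a noisy one, which is the behaviour a single global threshold on raw errors cannot provide.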
Abstract
Several techniques for multivariate time series anomaly detection have been proposed recently, but a systematic comparison on a common set of datasets and metrics is lacking. This paper presents a systematic and comprehensive evaluation of unsupervised and semi-supervised deep-learning-based methods for anomaly detection and diagnosis on multivariate time series data from cyberphysical systems. Unlike previous works, we vary the model and the post-processing of model errors, i.e. the scoring functions, independently of each other, through a grid of 10 models and 4 scoring functions, comparing these variants to state-of-the-art methods. In time-series anomaly detection, detecting anomalous events is more important than detecting individual anomalous time-points. Through experiments, we find that the existing evaluation metrics either do not take events into account, or cannot distinguish…
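The event versus time-point distinction can be illustrated with a small sketch of a composite F-score. This summary does not spell out the definition of $Fc_1$, so the code assumes it is the harmonic mean of time-wise precision and event-wise recall, where an event counts as detected if any of its time-points is flagged. The function names are hypothetical, and the details should be checked against the paper.

```python
import numpy as np


def events_from_labels(labels):
    """Return (start, end) pairs, end exclusive, for each contiguous run of 1s."""
    labels = np.asarray(labels).astype(int)
    padded = np.concatenate(([0], labels, [0]))
    diff = np.diff(padded)
    starts = np.flatnonzero(diff == 1)
    ends = np.flatnonzero(diff == -1)
    return list(zip(starts, ends))


def composite_f1(y_true, y_pred):
    """Assumed definition: harmonic mean of time-wise precision and event-wise recall."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)

    # Time-wise precision: fraction of flagged time-points that are truly anomalous.
    tp_t = int(np.sum(y_true & y_pred))
    fp_t = int(np.sum(~y_true & y_pred))
    precision_t = tp_t / (tp_t + fp_t) if (tp_t + fp_t) else 0.0

    # Event-wise recall: fraction of true events with at least one flagged point.
    events = events_from_labels(y_true)
    detected = sum(bool(y_pred[s:e].any()) for s, e in events)
    recall_e = detected / len(events) if events else 0.0

    if precision_t + recall_e == 0:
        return 0.0
    return 2 * precision_t * recall_e / (precision_t + recall_e)


# Two true events; the prediction hits one of them and raises one false alarm.
y_true = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
fc1 = composite_f1(y_true, y_pred)  # precision_t = 0.5, recall_e = 0.5
```

Under this assumed definition, flagging a single point inside a long event earns full credit for that event, while every false alarm still lowers precision point by point.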
