mTSBench: Benchmarking Multivariate Time Series Anomaly Detection and Model Selection at Scale
Xiaona Zhou, Constantin Brif, Ismini Lourentzou

TL;DR
mTSBench is a benchmark for multivariate time series anomaly detection that evaluates 24 detectors across 19 datasets and finds that better model selection strategies are needed.
Contribution
This paper introduces mTSBench, the largest benchmark for multivariate time series anomaly detection and model selection, with extensive evaluation of existing methods and open-source resources.
Findings
No single detector outperforms others across all datasets.
Current model selection methods are far from optimal.
The benchmark encourages future research in robust anomaly detection.
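The first two findings can be illustrated with a minimal sketch that compares three quantities: the best single detector on average, an oracle that picks the best detector for each series, and a model selector's picks. All detector names, scores, and selector choices below are hypothetical toy values chosen for illustration; they are not results from the paper, and the helper functions are not part of mTSBench.

```python
from typing import Dict, List, Tuple

# Hypothetical per-series scores for three made-up detectors
# (higher is better). These are NOT numbers from the paper.
SCORES: Dict[str, List[float]] = {
    "detector_a": [0.90, 0.40, 0.55],
    "detector_b": [0.50, 0.85, 0.60],
    "detector_c": [0.60, 0.60, 0.80],
}


def mean(values: List[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def best_single_detector(scores: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the detector with the highest mean score and that mean."""
    name = max(scores, key=lambda d: mean(scores[d]))
    return name, mean(scores[name])


def oracle_selection(scores: Dict[str, List[float]]) -> float:
    """Mean score when the best detector is chosen separately per series."""
    n_series = len(next(iter(scores.values())))
    per_series_best = [
        max(series_scores[i] for series_scores in scores.values())
        for i in range(n_series)
    ]
    return mean(per_series_best)


def selector_score(scores: Dict[str, List[float]], choices: List[str]) -> float:
    """Mean score achieved by a selector that picks choices[i] for series i."""
    return mean([scores[d][i] for i, d in enumerate(choices)])


if __name__ == "__main__":
    name, single = best_single_detector(SCORES)
    oracle = oracle_selection(SCORES)
    # Hypothetical selector output: one chosen detector per series.
    picked = selector_score(SCORES, ["detector_a", "detector_c", "detector_c"])
    print(f"best single detector: {name} ({single:.3f})")
    print(f"selector: {picked:.3f}, oracle: {oracle:.3f}")
    print(f"gap to oracle: {oracle - picked:.3f}")
```

In this toy setup each detector wins on a different series, so per-series selection can beat any fixed detector, and the distance between a selector's score and the oracle's is one way to express how far a selection method is from optimal.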
Abstract
Anomaly detection in multivariate time series is essential across domains such as healthcare, cybersecurity, and industrial monitoring, yet remains fundamentally challenging due to high-dimensional dependencies, the presence of cross-correlations between time-dependent variables, and the scarcity of labeled anomalies. We introduce mTSBench, the largest benchmark to date for multivariate time series anomaly detection and model selection, consisting of 344 labeled time series across 19 datasets from a wide range of application domains. We comprehensively evaluate 24 anomaly detectors, including the only two publicly available large language model-based methods for multivariate time series. Consistent with prior findings, we observe that no single detector dominates across datasets, motivating the need for effective model selection. We benchmark three recent model selection methods and…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting
