DQE: A Semantic-Aware Evaluation Metric for Time Series Anomaly Detection
Yuewei Li, Dalin Zhang, Huan Li, Xinyi Gong, Hongjun Chu, Zhaohui Song

TL;DR
This paper introduces DQE, a semantic-aware evaluation metric for time series anomaly detection that addresses limitations of existing metrics by providing more reliable, interpretable, and comprehensive assessment across detection semantics.
Contribution
The paper proposes DQE, a novel evaluation metric built on a partitioning strategy grounded in detection semantics, improving robustness and interpretability over existing metrics.
Findings
DQE offers stable and discriminative evaluation results.
It eliminates threshold-interval bias in detection assessment.
Extensive experiments confirm DQE's robustness and interpretability.
Abstract
Time series anomaly detection has achieved remarkable progress in recent years. However, evaluation practices have received comparatively less attention, despite their critical importance. Existing metrics exhibit several limitations: (1) bias toward point-level coverage, (2) insensitivity or inconsistency in near-miss detections, (3) inadequate penalization of false alarms, and (4) inconsistency caused by threshold or threshold-interval selection. These limitations can produce unreliable or counterintuitive results, hindering objective progress. In this work, we revisit the evaluation of time series anomaly detection from the perspective of detection semantics and propose a novel metric for more comprehensive assessment. We first introduce a partitioning strategy grounded in detection semantics, which decomposes the local temporal region of each anomaly into three functionally distinct…
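Limitation (1), bias toward point-level coverage, can be illustrated with a small toy example. The sketch below is not DQE and is not taken from the paper. It only computes the standard point-wise F1 score on hypothetical data (the series length, event positions, and the two detectors are all assumptions made for illustration) to show how a detector that finds one long anomaly can outscore one that finds every anomalous event.

```python
def pointwise_f1(labels, preds):
    """Standard point-wise F1 over binary label and prediction sequences."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def events_hit(events, preds):
    """Count anomaly events that contain at least one predicted point."""
    return sum(1 for start, length in events if any(preds[start:start + length]))


# Hypothetical ground truth: 100 points, one long event and three short ones,
# each given as (start index, length).
events = [(10, 40), (60, 2), (70, 2), (80, 2)]
labels = [0] * 100
for start, length in events:
    for i in range(start, start + length):
        labels[i] = 1

# Detector A (hypothetical): covers the long event fully, misses the short ones.
preds_a = [0] * 100
for i in range(10, 50):
    preds_a[i] = 1

# Detector B (hypothetical): flags a single point inside every event.
preds_b = [0] * 100
for i in (10, 60, 70, 80):
    preds_b[i] = 1

f1_a = pointwise_f1(labels, preds_a)
f1_b = pointwise_f1(labels, preds_b)
hits_a = events_hit(events, preds_a)
hits_b = events_hit(events, preds_b)

print(f"A: F1={f1_a:.3f}, events detected={hits_a}/{len(events)}")
print(f"B: F1={f1_b:.3f}, events detected={hits_b}/{len(events)}")
```

In this toy setup, detector A scores far higher on point-wise F1 despite detecting one of four events, while detector B detects all four with no false alarms. Whether B should be preferred depends on the application, but the gap shows how point-level counting lets long anomalies dominate the score.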
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting · Software System Performance and Reliability
