TL;DR
ADSketch is an interpretable, adaptive anomaly detection method for online service systems that identifies performance issues through pattern sketching, effectively handling evolving service behaviors and outperforming existing approaches.
Contribution
This paper introduces ADSketch, a pattern-sketching-based anomaly detection method for online service systems. It is designed to be interpretable, so engineers can tell which type of performance issue was detected, and adaptive, so it can keep up with services that change over time.
Findings
ADSketch outperforms state-of-the-art methods significantly.
The online algorithm effectively discovers new patterns.
ADSketch has been successfully deployed in industrial practice.
Abstract
To ensure the performance of online service systems, their status is closely monitored with various software and system metrics. Performance anomalies represent the performance degradation issues (e.g., slow response) of the service systems. When performing anomaly detection over the metrics, existing methods often lack the merit of interpretability, which is vital for engineers and analysts to take remediation actions. Moreover, they are unable to effectively accommodate the ever-changing services in an online fashion. To address these limitations, in this paper, we propose ADSketch, an interpretable and adaptive performance anomaly detection approach based on pattern sketching. ADSketch achieves interpretability by identifying groups of anomalous metric patterns, which represent particular types of performance issues. The underlying issues can then be immediately recognized if similar…
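The pattern-matching idea in the abstract can be illustrated with a small example. This is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `match_patterns`, the `known_patterns` dictionary, the `latency_spike` label, the `threshold` value, and the use of raw Euclidean distance are all hypothetical choices made for illustration. It slides a window over a metric series and reports windows that lie close to a previously labeled anomalous pattern, so each detection carries the name of the issue type it resembles.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_patterns(series, known_patterns, threshold):
    """Return (start_index, label, distance) for every window of `series`
    that lies within `threshold` of a labeled anomalous pattern.

    `known_patterns` maps a hypothetical issue label to an example metric
    subsequence; each pattern is matched using its own length.
    """
    hits = []
    for label, pattern in known_patterns.items():
        m = len(pattern)
        for start in range(len(series) - m + 1):
            dist = euclidean(series[start:start + m], pattern)
            if dist <= threshold:
                hits.append((start, label, dist))
    # Report detections in time order.
    hits.sort(key=lambda hit: hit[0])
    return hits


# Illustrative data only: a flat metric with one short spike.
series = [1, 1, 1, 1, 9, 9, 1, 1]
known_patterns = {"latency_spike": [1, 9, 9, 1]}

hits = match_patterns(series, known_patterns, threshold=1.0)
print(hits)  # [(3, 'latency_spike', 0.0)]
```

Because every hit is tied to a labeled pattern, the output says which kind of issue a window resembles rather than only that something is anomalous. An adaptive variant would also need to add newly observed patterns to `known_patterns` over time, which this sketch does not attempt.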
