RCAEval: A Benchmark for Root Cause Analysis of Microservice Systems with Telemetry Data
Luan Pham, Hongyu Zhang, Huong Ha, Flora Salim, Xiuzhen Zhang

TL;DR
RCAEval is an open-source benchmark that provides datasets and an evaluation environment for root cause analysis (RCA) in microservice systems, enabling standardized assessment of RCA approaches on failure cases that cover fault types observed in real-world failures.
Contribution
It introduces the first comprehensive benchmark that pairs large-scale datasets with an evaluation framework for RCA in microservice systems, supporting a wide range of RCA approaches.
Findings
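Three datasets comprise 735 failure cases collected from three microservice systems, covering various fault types observed in real-world failures.
The evaluation framework includes fifteen reproducible RCA baselines and supports both coarse-grained and fine-grained RCA.
The benchmark is ready to use and is intended to enable standardized, extensive analysis of RCA methods.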
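To make the idea of a standardized evaluation concrete, here is a minimal sketch of how ranked root-cause predictions could be scored across failure cases. It is not RCAEval's API or its actual metric: the function name top_k_hit_rate, the service names, and the data layout are all hypothetical, chosen only for illustration. The same scoring would apply at a finer granularity if the candidates were, for example, metric names instead of service names.

```python
from typing import Sequence


def top_k_hit_rate(
    ranked: Sequence[Sequence[str]], truths: Sequence[str], k: int
) -> float:
    """Fraction of failure cases whose true root cause is among the top-k candidates.

    Hypothetical helper for illustration; not part of RCAEval.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked) != len(truths):
        raise ValueError("one ranked list is required per failure case")
    if not truths:
        return 0.0
    hits = sum(1 for candidates, truth in zip(ranked, truths) if truth in candidates[:k])
    return hits / len(truths)


# Hypothetical output of one RCA method on three failure cases
# (coarse-grained: each candidate is a service name, most suspicious first).
ranked_services = [
    ["cart", "checkout", "payment"],
    ["payment", "cart", "checkout"],
    ["checkout", "payment", "cart"],
]
# Assumed ground truth: the fault was injected into "cart" in every case.
true_services = ["cart", "cart", "cart"]

for k in (1, 2, 3):
    print(f"top-{k} hit rate: {top_k_hit_rate(ranked_services, true_services, k):.2f}")
```

Running every baseline through one shared scoring function of this kind is what makes results comparable across methods and datasets.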
Abstract
Root cause analysis (RCA) for microservice systems has gained significant attention in recent years. However, there is still no standard benchmark that includes large-scale datasets and supports comprehensive evaluation environments. In this paper, we introduce RCAEval, an open-source benchmark that provides datasets and an evaluation environment for RCA in microservice systems. First, we introduce three comprehensive datasets comprising 735 failure cases collected from three microservice systems, covering various fault types observed in real-world failures. Second, we present a comprehensive evaluation framework that includes fifteen reproducible baselines covering a wide range of RCA approaches, with the ability to evaluate both coarse-grained and fine-grained RCA. We hope that this ready-to-use benchmark will enable researchers and practitioners to conduct extensive analysis and pave…
Taxonomy
Topics: Software System Performance and Reliability · IoT and Edge/Fog Computing · Service-Oriented Architecture and Web Services
