CHASE: A Causal Heterogeneous Graph based Framework for Root Cause Analysis in Multimodal Microservice Systems
Ziming Zhao, Zhenwei Wang, Tiehua Zhang, Zhishu Shen, Hai Dong, Zhen Lei, Xingjun Ma, Gaowei Xu, Zhijun Ding, Yun Yang

TL;DR
CHASE is a novel framework that uses a causal hypergraph approach to efficiently identify root causes of anomalies in complex multimodal microservice systems, improving accuracy over existing methods.
Contribution
The paper introduces CHASE, a new hypergraph-based framework that models causality in multimodal data for root cause analysis in microservices.
Findings
CHASE outperforms state-of-the-art methods with up to 36.2% accuracy gain.
The framework effectively integrates traces, logs, and metrics for anomaly detection.
Experimental results validate CHASE's superior performance on real datasets.
Abstract
In recent years, the widespread adoption of distributed microservice architectures within the industry has significantly increased the demand for enhanced system availability and robustness. Due to the complex service invocation paths and dependencies in enterprise-level microservice systems, it is challenging to locate the anomalies promptly during service invocations, thus causing intractable issues for normal system operations and maintenance. In this paper, we propose a Causal Heterogeneous grAph baSed framEwork for root cause analysis, namely CHASE, for microservice systems with multimodal data, including traces, logs, and system monitoring metrics. Specifically, related information is encoded into representative embeddings and further modeled by a multimodal invocation graph. Following that, anomaly detection is performed on each instance node with attentive heterogeneous message…
Taxonomy
Topics: Software System Performance and Reliability · Service-Oriented Architecture and Web Services · Network Security and Intrusion Detection
