Graph Neural AI with Temporal Dynamics for Comprehensive Anomaly Detection in Microservices
Qingyuan Zhang, Ning Lyu, Le Liu, Yuxi Wang, Ziyu Cheng, Cancan Hua

TL;DR
This paper introduces a unified graph neural network framework with temporal modeling for effective anomaly detection and root cause analysis in microservice architectures, handling complex dependencies and dynamic environments.
Contribution
It presents a novel combination of graph neural networks and temporal modeling to improve anomaly detection and root cause tracing in microservices.
Findings
Outperforms baseline methods in AUC, ACC, Recall, and F1-Score.
Maintains high accuracy and stability under dynamic topologies.
Effectively captures complex structural and temporal dependencies.
Abstract
This study addresses the problem of anomaly detection and root cause tracing in microservice architectures and proposes a unified framework that combines graph neural networks with temporal modeling. The microservice call chain is abstracted as a directed graph, where multidimensional features of nodes and edges are used to construct a service topology representation, and graph convolution is applied to aggregate features across nodes and model dependencies, capturing complex structural relationships among services. On this basis, gated recurrent units are introduced to model the temporal evolution of call chains, and multi-layer stacking and concatenation operations are used to jointly obtain structural and temporal representations, improving the ability to identify anomaly patterns. Furthermore, anomaly scoring functions at both the node and path levels are defined to achieve unified…
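The pipeline described in the abstract (graph convolution over the call graph, a gated recurrent unit over time, concatenation of the two representations, then node-level and path-level scores) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The weights are random rather than trained, only one graph-convolution layer is used, bias terms are omitted, and edge features are ignored. The scoring functions are placeholders: the visible part of the abstract does not define them, so the distance-from-mean node score and the mean-over-path score are assumptions. All function names and the three-service example are hypothetical.

```python
import math
import random

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rng, rows, cols):
    # Untrained random weights, used only so the sketch runs end to end.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def normalize_adjacency(adj):
    # Add self-loops, then row-normalize the directed adjacency matrix.
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in a_hat]

def graph_conv(a_norm, feats, W):
    # One layer: aggregate features from each node's callees, project, apply ReLU.
    n, d = len(feats), len(feats[0])
    agg = [[sum(a_norm[i][k] * feats[k][c] for k in range(n)) for c in range(d)]
           for i in range(n)]
    return [[max(0.0, v) for v in matvec(W, row)] for row in agg]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(h, x, p):
    # Standard GRU update for a single node (biases omitted).
    z = [sigmoid(v) for v in matvec(p["Wz"], h + x)]   # update gate
    r = [sigmoid(v) for v in matvec(p["Wr"], h + x)]   # reset gate
    gated = [ri * hi for ri, hi in zip(r, h)] + x
    cand = [math.tanh(v) for v in matvec(p["Wh"], gated)]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def encode(adj, snapshots, W, p, hidden):
    # snapshots: non-empty list over time of per-node feature vectors.
    a_norm = normalize_adjacency(adj)
    n = len(adj)
    h = [[0.0] * hidden for _ in range(n)]
    for feats in snapshots:
        s = graph_conv(a_norm, feats, W)
        h = [gru_step(h[i], s[i], p) for i in range(n)]
    # Concatenate the last structural output with the final temporal state.
    return [s[i] + h[i] for i in range(n)]

def node_scores(emb):
    # Placeholder score (assumption): distance from the mean node embedding.
    d = len(emb[0])
    mean = [sum(e[c] for e in emb) / len(emb) for c in range(d)]
    return [math.sqrt(sum((e[c] - mean[c]) ** 2 for c in range(d))) for e in emb]

def path_score_of(scores, path):
    # Placeholder score (assumption): mean node score along a call path.
    return sum(scores[i] for i in path) / len(path)

# Hypothetical call chain: gateway (0) -> orders (1) -> payments (2).
adj = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
# Three time steps of [latency, error rate] per service, already scaled.
snapshots = [
    [[0.10, 0.01], [0.20, 0.01], [0.15, 0.02]],
    [[0.11, 0.01], [0.21, 0.02], [0.16, 0.02]],
    [[0.12, 0.01], [0.25, 0.02], [0.90, 0.40]],  # payments degrades
]
rng = random.Random(0)
d_in, d_struct, hidden = 2, 3, 3
W = rand_matrix(rng, d_struct, d_in)
p = {k: rand_matrix(rng, hidden, hidden + d_struct) for k in ("Wz", "Wr", "Wh")}

emb = encode(adj, snapshots, W, p, hidden)
scores = node_scores(emb)
path_score = path_score_of(scores, [0, 1, 2])
print(scores, path_score)
```

With untrained weights the printed scores carry no diagnostic meaning. In a trained model, the node scores would flag individual services and the path score would rank call chains for root cause tracing.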
Taxonomy
Topics: Software System Performance and Reliability · Network Security and Intrusion Detection · Cloud Computing and Resource Management
