Time-Series Learning for Proactive Fault Prediction in Distributed Systems with Deep Neural Structures
Yang Wang, Wenxuan Zhu, Xuehui Quan, Heyi Wang, Chang Liu, Qiyuan Wu

TL;DR
This paper introduces a deep neural network method that combines a GRU with an attention mechanism for early fault prediction in distributed systems. Validated on data from a large-scale real-world cloud system, it reports higher accuracy and stability than existing time-series models.
Contribution
It presents a temporal feature learning approach that combines a GRU, an attention mechanism, and a feedforward neural classifier for proactive fault detection in distributed systems.
Findings
Outperforms mainstream time-series models in accuracy, F1-Score, and AUC.
Demonstrates strong prediction capability and stability.
Shows effective learning of system behavior patterns with reliable convergence.
Abstract
This paper addresses the challenges of fault prediction and delayed response in distributed systems by proposing an intelligent prediction method based on temporal feature learning. The method takes multi-dimensional performance metric sequences as input. We use a Gated Recurrent Unit (GRU) to model the evolution of system states over time. An attention mechanism is then applied to enhance key temporal segments, improving the model's ability to identify potential faults. On this basis, a feedforward neural network is designed to perform the final classification, enabling early warning of system failures. To validate the effectiveness of the proposed approach, comparative experiments and ablation analyses were conducted using data from a large-scale real-world cloud system. The experimental results show that the model outperforms various mainstream time-series models in terms of…
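The pipeline described in the abstract (GRU over the metric sequence, attention over time steps, feedforward classifier) can be sketched as an untrained forward pass. This is a minimal illustration, not the authors' implementation. The class name `GRUAttentionClassifier`, the additive (tanh) attention scoring, the ReLU hidden layer, the layer sizes, and the random weights are all assumptions made for the example, and no training loop is shown.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max())
    return e / e.sum()


class GRUAttentionClassifier:
    """Hypothetical sketch: GRU -> attention pooling -> feedforward classifier."""

    def __init__(self, n_metrics, hidden, n_classes=2, seed=0):
        rng = np.random.default_rng(seed)

        def init(*shape):
            # Small random weights; a real model would learn these.
            return rng.normal(0.0, 0.1, size=shape)

        # GRU parameters for the update gate (z), reset gate (r), candidate (h).
        self.W = {g: init(n_metrics, hidden) for g in "zrh"}
        self.U = {g: init(hidden, hidden) for g in "zrh"}
        self.b = {g: np.zeros(hidden) for g in "zrh"}
        # Additive attention parameters (an assumed scoring function).
        self.Wa = init(hidden, hidden)
        self.va = init(hidden)
        # Feedforward classification head.
        self.W1, self.b1 = init(hidden, hidden), np.zeros(hidden)
        self.W2, self.b2 = init(hidden, n_classes), np.zeros(n_classes)

    def gru(self, x):
        """Map a (T, n_metrics) sequence to (T, hidden) hidden states."""
        h = np.zeros(self.b["z"].shape[0])
        states = []
        for x_t in x:
            z = sigmoid(x_t @ self.W["z"] + h @ self.U["z"] + self.b["z"])
            r = sigmoid(x_t @ self.W["r"] + h @ self.U["r"] + self.b["r"])
            cand = np.tanh(x_t @ self.W["h"] + (r * h) @ self.U["h"] + self.b["h"])
            # One common convention: z interpolates between old and candidate state.
            h = (1.0 - z) * h + z * cand
            states.append(h)
        return np.stack(states)

    def forward(self, x):
        H = self.gru(x)                          # (T, hidden)
        scores = np.tanh(H @ self.Wa) @ self.va  # one score per time step
        alpha = softmax(scores)                  # attention weights over time
        context = alpha @ H                      # weighted sum of hidden states
        hidden = np.maximum(0.0, context @ self.W1 + self.b1)
        probs = softmax(hidden @ self.W2 + self.b2)
        return probs, alpha


# Usage: one window of 30 time steps with 8 made-up performance metrics.
window = np.random.default_rng(1).normal(size=(30, 8))
model = GRUAttentionClassifier(n_metrics=8, hidden=16)
probs, alpha = model.forward(window)
print(probs.shape, alpha.shape)
```

The attention weights `alpha` sum to one across time steps, so they can be read as how much each step contributes to the pooled representation passed to the classifier. With untrained weights the class probabilities carry no meaning; the sketch only shows how data flows through the three stages.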
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Fault Detection and Control Systems · Advanced Computational Techniques and Applications
Methods: Softmax · Attention Is All You Need
