Towards Implementing ML-Based Failure Detectors
Xiaonan Li, Olivier Marin

TL;DR
This paper investigates the potential of machine learning, specifically LSTM neural networks, for failure detection, demonstrating promising accuracy and detection speed despite higher computational costs.
Contribution
It presents a prototype implementation of an ML-based failure detector using LSTM, showing its feasibility and performance with real data traces.
Findings
ML-based detector achieves high accuracy
Detection time is acceptable despite longer computation
Prototype demonstrates viability of ML in failure detection
Abstract
Most existing failure detection algorithms rely on statistical methods, and very few use machine learning (ML). This paper explores the viability of ML in the field of failure detection: is it possible to implement an ML-based detector that achieves a satisfactory quality of service? We implement a prototype that uses a basic long short-term memory (LSTM) neural network algorithm, and study its behavior with real traces. Although the ML model has a comparatively longer computing time, our prototype performs well in terms of accuracy and detection time.
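To make the setting concrete, here is a minimal sketch of a heartbeat-style failure detector with a pluggable predictor. It is an illustration under stated assumptions, not the paper's prototype: the abstract does not describe the detector's architecture, so the class `HeartbeatFailureDetector`, the `window` and `margin` parameters, and the timeout rule (predicted next inter-arrival time plus a safety margin) are all hypothetical. The function `mean_predictor` is a plain moving average standing in for a learned model; an LSTM would be substituted at that point.

```python
from collections import deque
from typing import Callable, Deque, Optional, Sequence

# Hypothetical predictor type: maps recent heartbeat inter-arrival times
# (in seconds) to a prediction of the next one. A trained LSTM would plug in here.
Predictor = Callable[[Sequence[float]], float]


def mean_predictor(history: Sequence[float]) -> float:
    """Stand-in for a learned model: the mean of recent inter-arrival times."""
    return sum(history) / len(history)


class HeartbeatFailureDetector:
    """Suspects the monitored process once a predicted deadline has passed."""

    def __init__(self, predictor: Predictor, window: int = 8, margin: float = 0.5):
        self.predictor = predictor
        self.intervals: Deque[float] = deque(maxlen=window)  # sliding window
        self.margin = margin  # assumed safety margin, in seconds
        self.last_arrival: Optional[float] = None
        self.deadline: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat received at time `now` and refresh the deadline."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now
        if self.intervals:
            predicted = self.predictor(list(self.intervals))
            self.deadline = now + predicted + self.margin

    def suspects(self, now: float) -> bool:
        """True if no heartbeat arrived before the current deadline."""
        return self.deadline is not None and now > self.deadline
```

In this sketch the two quality-of-service concerns named in the abstract pull against each other through `margin`: a smaller margin shortens detection time but raises the chance of wrongly suspecting a live process, while the predictor's own computing time adds to the delay before each deadline is refreshed.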
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Fault Detection and Control Systems
