Towards Implementing ML-Based Failure Detectors

Xiaonan Li; Olivier Marin

arXiv:2210.00134 · cs.DC · October 4, 2022 · 1 cites

Open Access

TL;DR

This paper investigates the potential of machine learning, specifically LSTM neural networks, for failure detection, demonstrating promising accuracy and detection speed despite higher computational costs.

Contribution

It presents a prototype implementation of an ML-based failure detector using LSTM, showing its feasibility and performance with real data traces.

Findings

01 ML-based detector achieves high accuracy

02 Detection time is acceptable despite longer computation

03 Prototype demonstrates viability of ML in failure detection

Abstract

Most existing failure detection algorithms rely on statistical methods, and very few use machine learning (ML). This paper explores the viability of ML in the field of failure detection: is it possible to implement an ML-based detector that achieves a satisfactory quality of service? We implement a prototype that uses a basic long short-term memory (LSTM) neural network algorithm, and study its behavior with real traces. Although the ML model has a comparatively longer computing time, our prototype performs well in terms of accuracy and detection time.
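To make the idea in the abstract concrete, here is a minimal sketch of a heartbeat-style failure detector whose timeout is driven by a pluggable predictor of the next inter-arrival time. It is an illustration under stated assumptions, not the paper's prototype: the names `HeartbeatFailureDetector`, `mean_predictor`, `window`, and `safety_margin` are hypothetical, and the predictor shown is a simple moving average standing in for the LSTM model, which would be supplied through the same `predictor` argument.

```python
from collections import deque
from typing import Callable, Deque, Optional, Sequence


def mean_predictor(intervals: Sequence[float]) -> float:
    """Stand-in predictor: mean of recent inter-arrival times (NOT an LSTM)."""
    return sum(intervals) / len(intervals)


class HeartbeatFailureDetector:
    """Suspects a process when its heartbeat is later than the predicted arrival.

    `predictor` maps recent inter-arrival times to an estimate of the next one.
    An ML model (e.g. an LSTM) could be plugged in here; this is an assumption
    about how such a detector might be structured, not the authors' design.
    """

    def __init__(
        self,
        predictor: Callable[[Sequence[float]], float] = mean_predictor,
        window: int = 10,
        safety_margin: float = 0.5,
    ) -> None:
        self.predictor = predictor
        self.intervals: Deque[float] = deque(maxlen=window)
        self.safety_margin = safety_margin
        self.last_arrival: Optional[float] = None
        self.deadline: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat received at time `now` and refresh the deadline."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now
        if self.intervals:
            predicted = self.predictor(list(self.intervals))
            self.deadline = now + predicted + self.safety_margin

    def suspects(self, now: float) -> bool:
        """Return True if the monitored process is suspected at time `now`."""
        return self.deadline is not None and now > self.deadline


if __name__ == "__main__":
    fd = HeartbeatFailureDetector(window=3, safety_margin=0.5)
    for t in (0.0, 1.0, 2.0, 3.0):
        fd.heartbeat(t)
    print(fd.suspects(4.0), fd.suspects(5.0))
```

The two quantities the abstract evaluates map onto this sketch as follows: a larger `safety_margin` reduces wrong suspicions (accuracy) at the cost of a later deadline (detection time), while a better predictor lets the margin shrink without losing accuracy.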

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Fault Detection and Control Systems