Predicting SLA Violations in Real Time using Online Machine Learning
Jawwad Ahmed, Andreas Johnsson, Rerngvit Yanggratoke, John Ardelius, Christofer Flinta, Rolf Stadler

TL;DR
This paper presents an online machine learning method for real-time prediction of SLA violations in telecom services, reporting over 90% classification accuracy and a false alarm rate below 10% for a video-on-demand service under changing load patterns.
Contribution
It introduces a service-agnostic online learning approach that predicts SLA violations using device metrics in streaming scenarios, outperforming traditional offline methods.
Findings
Achieves over 90% classification accuracy
Maintains less than 10% false alarm rate
Effective under changing load patterns
Abstract
Detecting faults and SLA violations in a timely manner is critical for telecom providers, in order to avoid loss of business, revenue and reputation. At the same time, predicting SLA violations for user services in telecom environments is difficult, due to time-varying user demands and infrastructure load conditions. In this paper, we propose a service-agnostic online learning approach, whereby the behavior of the system is learned on the fly, in order to predict client-side SLA violations. The approach uses device-level metrics, which are collected in a streaming fashion on the server side. Our results show that the approach can produce highly accurate predictions (>90% classification accuracy and <10% false alarm rate) in scenarios where SLA violations are predicted for a video-on-demand service under changing load patterns. The paper also highlights the limitations of traditional…
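The general idea of learning on the fly from streamed server-side metrics can be illustrated with a small sketch. The code below is not the authors' method: the abstract does not name the learner, so an online logistic regression trained by stochastic gradient descent is swapped in as a stand-in, and the two features (cpu, mem), the synthetic load pattern, and all function and class names are hypothetical. It uses a test-then-train loop, in which each sample is first predicted and then used for an update, and it reports classification accuracy and false alarm rate, here assumed to mean the fraction of non-violation samples flagged as violations.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression updated one sample at a time with SGD (illustrative stand-in)."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def learn_one(self, x, y):
        # Gradient of the log loss for a single sample is (p - y) * x.
        error = self.predict_proba(x) - y
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error


def synthetic_stream(n_samples, seed=0):
    """Yield (features, label) pairs; the metrics and the violation rule are made up."""
    rng = random.Random(seed)
    for t in range(n_samples):
        # Hypothetical periodic load, standing in for a changing load pattern.
        load = 0.5 + 0.3 * math.sin(2 * math.pi * t / 500)
        cpu = min(1.0, max(0.0, load + rng.gauss(0, 0.15)))
        mem = min(1.0, max(0.0, 0.5 * load + 0.2 + rng.gauss(0, 0.1)))
        violated = 1 if cpu + 0.5 * mem + rng.gauss(0, 0.05) > 0.9 else 0
        yield [cpu, mem], violated


def prequential_evaluate(model, stream):
    """Test-then-train: predict each sample before the model learns from it."""
    tp = tn = fp = fn = 0
    for x, y in stream:
        y_hat = model.predict(x)
        if y_hat == 1 and y == 1:
            tp += 1
        elif y_hat == 0 and y == 0:
            tn += 1
        elif y_hat == 1 and y == 0:
            fp += 1
        else:
            fn += 1
        model.learn_one(x, y)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, false_alarm_rate


if __name__ == "__main__":
    model = OnlineLogisticRegression(n_features=2)
    acc, far = prequential_evaluate(model, synthetic_stream(5000))
    print(f"accuracy={acc:.3f} false_alarm_rate={far:.3f}")
```

Because every prediction is made before the corresponding label is seen, the reported numbers reflect performance on unseen samples without a separate offline training phase. The figures this toy produces say nothing about the results reported in the paper.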
Taxonomy
Topics: Data Stream Mining Techniques · Software System Performance and Reliability · Imbalanced Data Classification Techniques
