TL;DR
This paper presents a predictive auto-scaling architecture for cloud services using machine learning-based time-series forecasting, integrated into OpenStack with Monasca, to anticipate load peaks and improve scaling responsiveness.
Contribution
It introduces a novel predictive auto-scaling framework that leverages machine learning models within OpenStack, extending Monasca for proactive resource management.
Findings
Predictive policies outperform reactive auto-scaling in anticipating load peaks.
Recurrent neural networks and multi-layer perceptrons provide accurate future metric predictions.
The framework is customizable and integrates seamlessly with existing OpenStack components.
Abstract
Cloud auto-scaling mechanisms are typically based on reactive automation rules that scale a cluster whenever some metric, e.g., the average CPU usage among instances, exceeds a predefined threshold. Tuning these rules becomes particularly cumbersome when scaling up a cluster involves non-negligible times to bootstrap new instances, as frequently happens in production cloud services. To deal with this problem, we propose an architecture for auto-scaling cloud services based on the state into which the system is expected to evolve in the near future. Our approach leverages time-series forecasting techniques, like those based on machine learning and artificial neural networks, to predict the future dynamics of key metrics, e.g., resource consumption metrics, and applies a threshold-based scaling policy to them. The result is a predictive automation policy that is able, for instance,…
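The idea described in the abstract, applying a threshold rule to a forecast metric rather than to its current value, can be illustrated with a minimal sketch. This is not the paper's implementation: the summary names recurrent neural networks and multi-layer perceptrons as forecasters, whereas the sketch below substitutes a plain least-squares line as a stand-in. The function names (`forecast_linear`, `scaling_decision`), the thresholds, and the sample values are all hypothetical, and nothing here calls OpenStack or Monasca.

```python
from statistics import mean


def forecast_linear(samples, horizon):
    """Extrapolate `horizon` steps past the last sample.

    Assumption: a least-squares line over the recent window stands in
    for the neural forecasters mentioned in the paper summary.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # The last observed sample sits at x = n - 1.
    return intercept + slope * (n - 1 + horizon)


def scaling_decision(samples, horizon, scale_up_at=80.0, scale_down_at=20.0):
    """Apply a threshold rule to the predicted metric, not the current one.

    The threshold values are illustrative, not taken from the paper.
    """
    predicted = forecast_linear(samples, horizon)
    if predicted > scale_up_at:
        return "scale_up"
    if predicted < scale_down_at:
        return "scale_down"
    return "hold"


if __name__ == "__main__":
    # Hypothetical average CPU usage (%) over the last five intervals.
    cpu = [50, 55, 60, 65, 70]
    # A reactive rule sees 70 < 80 and holds; the predictive rule looks
    # three intervals ahead, roughly the time a new instance might need to boot.
    print(forecast_linear(cpu, horizon=3))
    print(scaling_decision(cpu, horizon=3))
```

In this toy setup the horizon plays the role of the instance bootstrap time: the further ahead the forecast looks, the earlier a rising trend triggers a scale-up, at the cost of acting on a less certain prediction.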
Taxonomy
Methods: Linear Regression
