MSARS: A Meta-Learning and Reinforcement Learning Framework for SLO Resource Allocation and Adaptive Scaling for Microservices
Kan Hu, Linfeng Wen, Minxian Xu, Kejiang Ye

TL;DR
MSARS is a novel framework combining meta-learning and reinforcement learning to rapidly allocate resources and adaptively scale microservices, reducing SLO violations and resource costs in dynamic cloud environments.
Contribution
The paper introduces MSARS, a framework that integrates graph neural networks, meta-learning, and improved reinforcement learning for efficient SLO resource allocation and microservice auto-scaling.
Findings
MSARS reduces adaptation time by 40% compared to existing methods.
It achieves a 38% reduction in SLO violations.
Resource costs are decreased by 8% with MSARS.
Abstract
Service Level Objectives (SLOs) set thresholds for service time in cloud services to ensure acceptable quality of service (QoS) and user satisfaction. Many current studies treat SLOs as a system resource to be allocated so that QoS meets the SLOs. Existing microservice auto-scaling frameworks that rely on SLO resources often use complex, computationally intensive models that require significant time and resources to determine an appropriate resource allocation. This paper aims to rapidly allocate SLO resources and minimize resource costs while ensuring that application QoS meets the SLO requirements in a dynamically changing microservice environment. We propose MSARS, a framework that leverages meta-learning to quickly derive SLO resource allocation strategies and employs reinforcement learning for adaptive scaling of microservice resources. It features three innovative…
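To make the notion of an SLO as a service-time threshold concrete, the following is a minimal sketch of how an SLO violation rate might be measured. It is not taken from the paper: the function name `slo_violation_rate`, the 200 ms threshold, and the sample latencies are all hypothetical, chosen only for illustration.

```python
from typing import Sequence


def slo_violation_rate(latencies_ms: Sequence[float], slo_ms: float) -> float:
    """Return the fraction of requests whose service time exceeds the SLO.

    Hypothetical helper for illustration; not part of MSARS.
    """
    if not latencies_ms:
        # No requests observed, so no violations can be counted.
        return 0.0
    violations = sum(1 for t in latencies_ms if t > slo_ms)
    return violations / len(latencies_ms)


# Made-up service times (milliseconds) checked against an assumed 200 ms SLO.
sample_latencies = [120.0, 180.0, 250.0, 90.0, 310.0]
rate = slo_violation_rate(sample_latencies, slo_ms=200.0)
print(f"SLO violation rate: {rate:.0%}")
```

A metric of this kind is one plausible basis for comparing auto-scaling approaches, although the abstract as shown here does not specify how violations are counted.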
Taxonomy
Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · IoT and Edge/Fog Computing
