A Deep Reinforcement Learning based Algorithm for Time and Cost Optimized Scaling of Serverless Applications
Anupama Mampage, Shanika Karunasekera, Rajkumar Buyya

TL;DR
This paper presents a multi-agent deep reinforcement learning algorithm that optimizes time and cost in serverless application scaling, reducing cold starts and resource wastage.
Contribution
It introduces a novel multi-agent DRL approach for both horizontal and vertical scaling, considering system-wide requirements for improved efficiency.
Findings
Up to 23% reduction in application latency
Up to 34% decrease in request failures
Up to 45% savings in infrastructure costs
Abstract
Serverless computing has gained strong traction in the cloud computing community in recent years. Among the many benefits of this novel computing model, the rapid auto-scaling capability of user applications takes prominence. However, the offer of ad hoc scaling of user deployments at the function level introduces many complications to serverless systems. One such prevalent shortcoming is the cold-start delay: the time consumed in dynamically creating new resources to suit function workloads, which adds delay and failures to function request executions. Maintaining idle resource pools to alleviate this issue often results in wasted resources from the cloud provider's perspective. Existing solutions to address this limitation mostly focus on predicting and understanding function load levels in order to proactively create required resources. Although these solutions…
Taxonomy
Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Caching and Content Delivery
