A Deep Recurrent-Reinforcement Learning Method for Intelligent AutoScaling of Serverless Functions

Siddharth Agarwal; Maria A. Rodriguez; Rajkumar Buyya

arXiv:2308.05937·cs.DC·November 13, 2024

TL;DR

This paper proposes a deep recurrent reinforcement learning approach for autoscaling serverless functions, demonstrating improved performance over traditional threshold-based methods in dynamic cloud environments.

Contribution

The paper introduces an LSTM-enhanced PPO algorithm for function autoscaling that addresses partial observability and outperforms existing threshold-based autoscaling strategies.
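
To make the PPO part of this contribution concrete, here is a minimal sketch of the standard clipped surrogate loss with an entropy bonus, written in plain Python. It is not the authors' implementation: the function name `ppo_clipped_loss`, its parameters, and the default values `clip_eps=0.2` and `entropy_coef=0.01` are illustrative assumptions, and the LSTM policy that would produce the log-probabilities is omitted.

```python
import math


def ppo_clipped_loss(new_logp, old_logp, advantages, entropies,
                     clip_eps=0.2, entropy_coef=0.01):
    """Return the mean PPO loss (to be minimised) over a batch.

    All names and defaults are hypothetical; inputs are equal-length
    lists of floats, one entry per sampled action. The batch must be
    non-empty.
    """
    total = 0.0
    for lp_new, lp_old, adv, ent in zip(new_logp, old_logp, advantages, entropies):
        # Probability ratio between the updated and the behaviour policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (lower) bound of the clipped and unclipped objectives.
        surrogate = min(ratio * adv, clipped * adv)
        # Entropy regularization rewards exploration; negate to get a loss.
        total += -(surrogate + entropy_coef * ent)
    return total / len(advantages)
```

With a positive advantage, the clip caps how much a single update can raise an action's probability; with a negative advantage, the unclipped term dominates, so harmful actions are still penalised in full.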

Findings

01

Recurrent RL agents effectively model environment dynamics.

02

LSTM-based autoscaling improves throughput by 18%.

03

The LSTM-based agent increases function execution by 13% and scales 8.4% more instances.

Abstract

Function-as-a-Service (FaaS) introduces a lightweight, function-based cloud execution model that finds its relevance in a range of applications like IoT-edge data processing and anomaly detection. While cloud service providers (CSPs) offer a near-infinite function elasticity, these applications often experience fluctuating workloads and stricter performance constraints. A typical CSP strategy is to empirically determine and adjust desired function instances or resources, known as autoscaling, based on monitoring-based thresholds such as CPU or memory, to cope with demand and performance. However, threshold configuration either requires expert knowledge, historical data or a complete view of the environment, making autoscaling a performance bottleneck that lacks an adaptable solution. Reinforcement learning (RL) algorithms are proven to be beneficial in analysing complex cloud environments and result in an adaptable policy that maximizes the…
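
The threshold-based autoscaling that the abstract describes as the baseline can be sketched as a proportional rule, the same form used by the Kubernetes Horizontal Pod Autoscaler. The sketch below is an illustration under assumptions, not the baseline evaluated in the paper: the function name `desired_replicas`, the replica bounds, and the 10% tolerance band are hypothetical choices.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Threshold-style scaling decision from one monitored metric.

    current_metric and target_metric share a unit, e.g. average CPU
    utilisation in percent. All names and defaults are hypothetical.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current instance count.
        desired = current_replicas
    else:
        # Scale proportionally to how far the metric is from the target.
        desired = math.ceil(current_replicas * ratio)
    # Never leave the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, 90, 60))  # 90% CPU against a 60% target
```

The target value and tolerance have to be chosen by hand for each workload, which is the configuration burden the abstract points to as the motivation for a learned policy.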

Code & Models

Repositories

cloudslab/dre-scale
PyTorch · Official

Taxonomy

Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Age of Information Optimization

Methods: Sigmoid Activation · Tanh Activation · Entropy Regularization · Long Short-Term Memory · Proximal Policy Optimization