Mitigating Cold Starts in Serverless Platforms: A Pool-Based Approach

Ping-Min Lin; Alex Glikson

arXiv:1903.12221·cs.DC·April 1, 2019·50 cites

Open Access

TL;DR

This paper presents a pool-based approach to reducing cold start latency in serverless platforms, implemented for Knative Serving. Maintaining a pool of warm containers cuts P99 response time by 85%.

Contribution

The paper introduces a novel implementation of a warm container pool for Knative Serving, significantly reducing cold start latency compared to the original system.

Findings

01  85% reduction in P99 response time with the pool-based approach
02  Effective mitigation of cold start latency in serverless platforms
03  Demonstrated improvements in latency for bursty workloads

Abstract

Rapid adoption of the serverless (or Function-as-a-Service, FaaS) paradigm, pioneered by Amazon with AWS Lambda and followed by numerous commercial offerings and open source projects, introduces new challenges in designing the cloud infrastructure, balancing between performance and cost. While instant per-request elasticity that FaaS platforms typically offer application developers makes it possible to achieve high performance of bursty workloads without over-provisioning, such elasticity often involves extra latency associated with on-demand provisioning of individual runtime containers that serve the functions. This phenomenon is often called cold starts, as opposed to the situation when a function is served by a pre-provisioned "warm" container, ready to serve requests with close to zero overhead. Providers are constantly working on techniques aimed at reducing cold starts. A common…
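The cold versus warm distinction described in the abstract can be illustrated with a toy simulation. This is a minimal sketch, not the paper's Knative Serving implementation: the `WarmPool` class, the `serve_burst` helper, and both latency constants are hypothetical names and values chosen only for illustration. It shows a burst of requests served with and without pre-provisioned containers, and compares the resulting P99 startup latency.

```python
import math
from collections import deque

# Hypothetical latencies in milliseconds; illustrative only, not from the paper.
COLD_START_MS = 2000.0
WARM_OVERHEAD_MS = 5.0


class WarmPool:
    """Toy pool of pre-provisioned containers (not the Knative implementation)."""

    def __init__(self, size):
        self.idle = deque(f"warm-{i}" for i in range(size))
        self.created = size

    def acquire(self):
        """Return (container, startup_latency_ms) for one incoming request."""
        if self.idle:
            return self.idle.popleft(), WARM_OVERHEAD_MS
        # Pool exhausted: provision on demand and pay the cold start.
        self.created += 1
        return f"cold-{self.created}", COLD_START_MS

    def release(self, container):
        """Return a container to the pool once its request has finished."""
        self.idle.append(container)


def percentile(values, p):
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


def serve_burst(pool, n_requests):
    """Serve n concurrent requests; containers are released after the burst."""
    held, latencies = [], []
    for _ in range(n_requests):
        container, latency = pool.acquire()
        held.append(container)
        latencies.append(latency)
    for container in held:
        pool.release(container)
    return latencies


no_pool = serve_burst(WarmPool(size=0), 10)
with_pool = serve_burst(WarmPool(size=10), 10)
print(percentile(no_pool, 99), percentile(with_pool, 99))
```

In this sketch a pool smaller than the burst still leaves the overflow requests paying the cold start, so the tail latency depends on how the pool size compares with peak concurrency. Sizing the pool is a cost versus performance trade-off of the kind the abstract mentions.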

Taxonomy

Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · IoT and Edge/Fog Computing