An Empirical Evaluation of Serverless Cloud Infrastructure for Large-Scale Data Processing

Thomas Bodner; Theo Radig; David Justen; Daniel Ritter; Tilmann Rabl

arXiv:2501.07771·cs.DB·January 15, 2025·2 cites

TL;DR

This paper provides a comprehensive analysis of serverless cloud infrastructure for large-scale data processing, highlighting performance variability, cost considerations, and practical guidelines for effective use.

Contribution

It introduces the Skyrise evaluation platform and offers detailed performance and cost insights for serverless data processing on AWS.

Findings

01. Identifies performance variability boundaries in serverless networks and storage.

02. Provides cost break-even points for serverless compute and storage.

03. Offers practical guidance for deploying serverless data processing systems.
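
To make the idea of a compute cost break-even point concrete, the following is a minimal sketch and not the paper's cost model. It assumes a simplified pricing scheme in which a serverless function is billed per GB-second of busy time and a virtual server is billed per hour regardless of load. The function name, its parameters, and the prices in the usage lines are hypothetical placeholders, not figures from the paper or from any provider's price list. Per-request fees, storage, and network charges are ignored.

```python
def break_even_utilization(vm_price_per_hour: float,
                           fn_price_per_gb_second: float,
                           fn_memory_gb: float) -> float:
    """Return the fraction of an hour a function must be busy before an
    always-on virtual server of comparable capacity becomes cheaper.

    Simplifying assumptions (not from the paper): one function instance
    is treated as equivalent to one server, and only compute time is billed.
    """
    if min(vm_price_per_hour, fn_price_per_gb_second, fn_memory_gb) <= 0:
        raise ValueError("all prices and sizes must be positive")
    # Cost of keeping the function busy for a full hour.
    fn_price_per_busy_hour = fn_price_per_gb_second * fn_memory_gb * 3600
    # Utilization at which both options cost the same.
    return vm_price_per_hour / fn_price_per_busy_hour


# Placeholder prices chosen only for illustration.
utilization = break_even_utilization(
    vm_price_per_hour=0.10,
    fn_price_per_gb_second=0.00002,
    fn_memory_gb=2.0,
)
print(f"break-even utilization: {utilization:.1%}")
```

Under these placeholder inputs, a result below 1.0 means the server is cheaper once the workload keeps it busy for more than that fraction of the time; a result of 1.0 or higher means the function is never more expensive under this simplified model.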

Abstract

Data processing systems are increasingly deployed in the cloud. While monolithic systems run fully on virtual servers, recent systems embrace cloud infrastructure and utilize the disaggregation of compute and storage to scale them independently. The introduction of serverless compute services, such as AWS Lambda, enables finer-grained and elastic scalability within these systems. Prior work shows the viability of serverless infrastructure for scalable data processing yet also sees limitations due to variable performance and cost overhead, in particular for networking and storage. In this paper, we perform a detailed analysis of the performance and cost characteristics of serverless infrastructure in the data processing context. We base our analysis on a large series of micro-benchmarks across different compute and storage services, as well as end-to-end workloads. To enable our…

Code & Models

Repositories

hpides/skyrise
none · Official

Taxonomy

Topics: Cloud Computing and Resource Management