A milestone for FaaS pipelines; object storage vs VM-driven data exchange

Germán T. Eizaguirre; Marc Sánchez-Artigas; Pedro García-López

arXiv:2207.12083·cs.DC·July 26, 2022

TL;DR

This paper compares object storage and VM-driven data exchange in serverless function workflows. Using a genomics pipeline, it shows that object storage is an effective way to pass data when shuffle stages use an appropriate number of functions, and that it can even outperform running those stages inside a powerful VM.

Contribution

It provides an empirical evaluation of object storage performance in serverless data workflows, highlighting scenarios where it outperforms VM-based approaches.

Findings

1. Object storage can outperform VM-based shuffle stages in serverless workflows.
2. Performance depends on the number of functions used in shuffling stages.
3. Object storage is a viable data passing method in genomics pipelines.

Abstract

Serverless functions provide high levels of parallelism, short startup times, and "pay-as-you-go" billing. These attributes make them a natural substrate for data analytics workflows. However, the impossibility of direct communication between functions makes the execution of workflows challenging. The current practice to share intermediate data among functions is through remote object storage (e.g., IBM COS). Contrary to conventional wisdom, the performance of object storage is not well understood. For instance, object storage can even be superior to other simpler approaches like the execution of shuffle stages (e.g., GroupBy) inside powerful VMs to avoid all-to-all transfers between functions. Leveraging a genomics pipeline, we show that object storage is a reasonable choice for data passing when the appropriate number of functions is used in shuffling stages.
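
The all-to-all exchange that the abstract refers to can be sketched as follows. This is a minimal illustration and not the paper's implementation: the in-memory `object_store` dictionary stands in for a remote service such as IBM COS, and every name here (`map_function`, `reduce_function`, `run_shuffle`, the `shuffle/m…/r…` key layout) is hypothetical. It shows why the number of functions matters: with M mappers and R reducers, a GroupBy through object storage issues M×R writes and M×R reads.

```python
from collections import defaultdict
from zlib import crc32

# Hypothetical stand-in for a remote object store (bucket): object key -> payload.
object_store = {}


def partition_of(key, num_reducers):
    # Deterministic hash, so every mapper routes a given key to the same reducer.
    return crc32(key.encode()) % num_reducers


def map_function(mapper_id, records, num_reducers):
    """Split this mapper's records by reducer and write one object per reducer."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_of(key, num_reducers)].append((key, value))
    for r in range(num_reducers):
        object_store[f"shuffle/m{mapper_id}/r{r}"] = partitions.get(r, [])


def reduce_function(reducer_id, num_mappers):
    """Read this reducer's partition from every mapper and group values by key."""
    groups = defaultdict(list)
    for m in range(num_mappers):
        for key, value in object_store[f"shuffle/m{m}/r{reducer_id}"]:
            groups[key].append(value)
    return dict(groups)


def run_shuffle(chunks, num_reducers):
    """Run the map and reduce stages sequentially; real functions would run in parallel."""
    object_store.clear()
    for m, chunk in enumerate(chunks):
        map_function(m, chunk, num_reducers)
    merged = {}
    for r in range(num_reducers):
        merged.update(reduce_function(r, len(chunks)))
    # M*R writes by the mappers plus M*R reads by the reducers.
    requests = 2 * len(chunks) * num_reducers
    return merged, requests


if __name__ == "__main__":
    chunks = [[("chr1", 1), ("chr2", 2)], [("chr1", 3), ("chr3", 4)]]
    grouped, requests = run_shuffle(chunks, num_reducers=3)
    print(grouped, requests)
```

Because the request count grows with the product of the two stage sizes, adding functions increases parallelism but also the number of small objects exchanged, which is the trade-off behind choosing an appropriate number of functions for shuffle stages.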
