FaaSter Troubleshooting -- Evaluating Distributed Tracing Approaches for Serverless Applications

Maria C. Borges; Sebastian Werner; Ahmet Kilic

arXiv:2110.03471 · cs.SE · July 16, 2024

TL;DR

This paper evaluates distributed tracing methods to improve fault detection in serverless applications, comparing developer-driven and platform-supported approaches through a model and empirical measurements.

Contribution

It introduces a fault observability model for serverless applications and compares two distributed tracing approaches, providing insights into their trade-offs and effectiveness.

Findings

01 Platform-supported tracing reduces troubleshooting time.

02 Developer-driven tracing offers more detailed fault insights.

03 Trade-offs include increased latency and resource use.

Abstract

Serverless applications can be particularly difficult to troubleshoot, as these applications are often composed of various managed and partly managed services. Faults are often unpredictable and can occur at multiple points, even in simple compositions. Each additional function or service in a serverless composition introduces a new possible fault source and a new layer to obfuscate faults. Currently, serverless platforms offer only limited support for identifying runtime faults. Developers looking to observe their serverless compositions often have to rely on scattered logs and ambiguous error messages to pinpoint root causes. In this paper, we investigate the use of distributed tracing for improving the observability of faults in serverless applications. To this end, we first introduce a model for characterizing fault observability, then provide a prototypical tracing implementation -…
