TL;DR
This paper introduces OXN, an automated tool for systematically evaluating observability configurations in microservice applications by injecting faults and modifying observability settings to inform better architectural decisions.
Contribution
The paper presents OXN, a novel automation framework that enables systematic and repeatable experiments on observability design choices in cloud-native applications.
Findings
OXN can inject faults and modify observability settings automatically.
Experiments with OXN reveal trade-offs in observability configurations.
OXN facilitates evidence-based decision-making for microservice observability.
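To make the idea of comparing observability design alternatives under an injected fault concrete, here is a toy simulation. It is a minimal sketch only and does not use OXN's API or reproduce the authors' experiments. Every name in it (`ObservabilityConfig`, `run_experiment`, the sampling rates, the 150 ms latency fault) is a hypothetical assumption chosen for illustration. It injects a latency fault into a window of synthetic requests, records only a sampled fraction of them, and reports how much data was recorded (a stand-in for cost) alongside the latency shift visible in that data (a stand-in for fault visibility).

```python
import random
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Tuple


@dataclass(frozen=True)
class ObservabilityConfig:
    """Hypothetical design alternative: only the trace sampling rate varies."""
    name: str
    sampling_rate: float  # fraction of requests that get recorded, in [0, 1]


@dataclass(frozen=True)
class ExperimentResult:
    config: str
    recorded: int                       # number of recorded requests (cost proxy)
    observed_shift_ms: Optional[float]  # latency increase visible in recorded data


def run_experiment(
    config: ObservabilityConfig,
    requests: int = 1000,
    fault_window: Tuple[int, int] = (400, 600),
    fault_delay_ms: float = 150.0,
    seed: int = 0,
) -> ExperimentResult:
    """Simulate one run: inject a latency fault, then sample what gets recorded."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    start, end = fault_window
    normal, faulty = [], []
    for i in range(requests):
        # Synthetic baseline latency (assumed: about 100 ms with 10 ms spread).
        latency = max(rng.gauss(100.0, 10.0), 0.0)
        in_fault = start <= i < end
        if in_fault:
            latency += fault_delay_ms  # the injected fault
        # Only a sampled fraction of requests reaches the observability backend.
        if rng.random() < config.sampling_rate:
            (faulty if in_fault else normal).append(latency)
    # The shift is only measurable if both phases left recorded data behind.
    shift = mean(faulty) - mean(normal) if faulty and normal else None
    return ExperimentResult(config.name, len(normal) + len(faulty), shift)


if __name__ == "__main__":
    for cfg in (
        ObservabilityConfig("full", 1.0),
        ObservabilityConfig("sampled-10pct", 0.1),
        ObservabilityConfig("off", 0.0),
    ):
        print(run_experiment(cfg))
```

Running the same seeded experiment for each configuration gives directly comparable rows. In this toy model, lower sampling records less data, but the shift estimate rests on fewer requests and disappears entirely when nothing is recorded.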
Abstract
Observability is important to ensure the reliability of microservice applications. These applications are often prone to failures, since they have many independent services deployed on heterogeneous environments. When employed "correctly", observability can help developers identify and troubleshoot faults quickly. However, instrumenting and configuring the observability of a microservice application is not trivial but tool-dependent and tied to costs. Practitioners need to understand observability-related trade-offs in order to weigh between different observability design alternatives. Still, these architectural design decisions are not supported by systematic methods and typically just rely on "professional intuition". To assess observability design trade-offs with concrete evidence, we advocate for conducting experiments that compare various design alternatives. Achieving a…
