Fallout: Distributed Systems Testing as a Service
Guy Bolton King, Sean McCarthy, Pushkala Pattabhiraman, Jake Luciani, Matt Fleming

TL;DR
Fallout is an open-source distributed systems testing service that automates configuration, workload execution, and performance analysis, aiding validation across diverse cluster setups and supporting multiple open-source projects.
Contribution
This paper introduces Fallout, a comprehensive testing platform for distributed systems that automates setup, benchmarking, and reporting, with five years of operational experience at DataStax.
Findings
Automates configuration and benchmarking of distributed systems.
Supports diverse workloads and generates performance reports.
Has run internally at DataStax for over 5 years.
Abstract
All modern distributed systems list performance and scalability as their core strengths. Given that optimal performance requires carefully selecting configuration options, and typical cluster sizes can range anywhere from 2 to 300 nodes, it is rare for any two clusters to be exactly the same. Validating the behavior and performance of distributed systems in this large configuration space is challenging without automation that stretches across the software stack. In this paper we present Fallout, an open-source distributed systems testing service that automatically provisions and configures distributed systems and clients, supports running a variety of workloads and benchmarks, and generates performance reports based on collected metrics for visual analysis. We have been running the Fallout service internally at DataStax for over 5 years and have recently open sourced it to support our…
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · Distributed systems and fault tolerance
