TL;DR
This paper introduces Theodolite, a method for benchmarking the scalability of distributed stream processing engines such as Kafka Streams and Flink when they are used inside microservices. It proposes one benchmark per use case and relevant workload dimension.
Contribution
It defines a systematic approach to benchmarking scalability, built on use cases, the workload dimensions relevant to each use case, and a general framework that executes the resulting benchmarks and is applicable to cloud deployments.
Findings
Theodolite effectively measures scalability across different workloads.
The scalability of Kafka Streams and Flink varies with deployment options.
Four use cases and seven workload dimensions guide benchmarking efforts.
Abstract
Distributed stream processing engines are designed with a focus on scalability to process big data volumes in a continuous manner. We present the Theodolite method for benchmarking the scalability of distributed stream processing engines. The core of this method is the definition of use cases that microservices implementing stream processing have to fulfill. For each use case, our method identifies relevant workload dimensions that might affect the scalability of a use case. We propose to design one benchmark per use case and relevant workload dimension. We present a general benchmarking framework, which can be applied to execute the individual benchmarks for a given use case and workload dimension. Our framework executes an implementation of the use case's dataflow architecture for different workloads of the given dimension and various numbers of processing instances. This way, it…
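The execution scheme described in the abstract (one implementation run for different workloads and various numbers of processing instances) can be sketched as a grid of experiments. The sketch below is illustrative only and is not the Theodolite framework's API. The names `run_grid`, `min_instances_per_workload`, and `simulated_experiment` are hypothetical. Two further points are assumptions that the truncated abstract does not confirm: the pass/fail criterion for an experiment, and the choice to summarize the grid as the smallest passing instance count per workload.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def run_grid(
    workloads: Iterable[int],
    instance_counts: Iterable[int],
    run_experiment: Callable[[int, int], bool],
) -> Dict[Tuple[int, int], bool]:
    """Run one experiment per (workload, instance count) pair.

    run_experiment is a hypothetical callback that deploys the use case
    with the given number of instances, applies the workload, and reports
    whether the deployment kept up (the criterion is an assumption).
    """
    counts = list(instance_counts)
    return {
        (load, n): run_experiment(load, n)
        for load in workloads
        for n in counts
    }


def min_instances_per_workload(
    grid: Dict[Tuple[int, int], bool],
) -> Dict[int, Optional[int]]:
    """Assumed summary: smallest passing instance count per workload."""
    summary: Dict[int, Optional[int]] = {}
    for (load, n), passed in grid.items():
        summary.setdefault(load, None)
        if passed and (summary[load] is None or n < summary[load]):
            summary[load] = n
    return summary


def simulated_experiment(load: int, instances: int) -> bool:
    """Stand-in for a real deployment: each instance is assumed to
    handle 1000 units of load, with perfectly linear scaling."""
    capacity_per_instance = 1000
    return instances * capacity_per_instance >= load


grid = run_grid([500, 2000, 3500, 10000], [1, 2, 4], simulated_experiment)
demand = min_instances_per_workload(grid)
print(demand)
```

In this simulation, a workload that no tested instance count can handle maps to `None`. In a real setup, `simulated_experiment` would be replaced by code that deploys the engine under test and measures whether it keeps up with the generated load.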
