Performance Optimization in Stream Processing Systems: Experiment-Driven Configuration Tuning for Kafka Streams
David Chen, Sören Henning, Kassiano Matteussi, Rick Rabiser

TL;DR
This paper introduces an experiment-driven, automated configuration tuning approach for Kafka Streams in cloud environments that combines sampling, stochastic search, and local refinement, improving throughput by up to 23% over the default configuration.
Contribution
It presents a novel multi-phase optimization workflow integrating Latin Hypercube Sampling, Simulated Annealing, and Hill Climbing for stream processing system configuration tuning.
Findings
Up to 23% throughput improvement over default configurations.
Latin Hypercube Sampling and Simulated Annealing are highly effective.
Hill Climbing provides limited additional benefits.
Abstract
Configuring stream processing systems for efficient performance, especially in cloud-native deployments, is a challenging and largely manual task. We present an experiment-driven approach for automated configuration optimization that combines three phases: Latin Hypercube Sampling for initial exploration, Simulated Annealing for guided stochastic search, and Hill Climbing for local refinement. The workflow is integrated with the cloud-native Theodolite benchmarking framework, enabling automated experiment orchestration on Kubernetes and early termination of underperforming configurations. In an experimental evaluation with Kafka Streams and a Kubernetes-based cloud testbed, our approach identifies configurations that improve throughput by up to 23% over the default. The results indicate that Latin Hypercube Sampling with early termination and Simulated Annealing are particularly…
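The three-phase workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the parameter names and ranges, the synthetic objective `synthetic_throughput`, the cooling schedule, and the step sizes are all assumptions chosen for illustration. In the paper, each evaluation is a benchmark run orchestrated through Theodolite on Kubernetes, and early termination of underperforming configurations is not modeled here. The local refinement step is a simple stochastic hill climber.

```python
import math
import random

# Hypothetical tuning parameters and ranges (illustrative only, not from the paper).
PARAM_RANGES = {
    "num_stream_threads": (1, 16),
    "commit_interval_ms": (10, 1000),
    "cache_max_bytes_mb": (0, 256),
}


def latin_hypercube(ranges, n, rng):
    """Draw n configs; each parameter's range is split into n strata, each used exactly once."""
    columns = {}
    for name, (lo, hi) in ranges.items():
        strata = list(range(n))
        rng.shuffle(strata)
        columns[name] = [lo + (s + rng.random()) / n * (hi - lo) for s in strata]
    return [{name: columns[name][i] for name in ranges} for i in range(n)]


def neighbor(config, ranges, rng, scale):
    """Perturb one randomly chosen parameter and clamp it to its range."""
    name = rng.choice(list(ranges))
    lo, hi = ranges[name]
    new = dict(config)
    new[name] = min(hi, max(lo, config[name] + rng.gauss(0, scale * (hi - lo))))
    return new


def simulated_annealing(start, score, ranges, rng, steps=200, t0=1.0, cooling=0.97):
    """Maximize score; worse candidates are accepted with probability exp(delta / t)."""
    current, current_score = start, score(start)
    best, best_score = current, current_score
    t = t0
    for _ in range(steps):
        cand = neighbor(current, ranges, rng, scale=0.2)
        cand_score = score(cand)
        delta = cand_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = cand, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
        t *= cooling  # geometric cooling (an assumed schedule)
    return best, best_score


def hill_climb(start, score, ranges, rng, steps=100):
    """Stochastic hill climbing: accept only strictly better small perturbations."""
    best, best_score = start, score(start)
    for _ in range(steps):
        cand = neighbor(best, ranges, rng, scale=0.02)
        cand_score = score(cand)
        if cand_score > best_score:
            best, best_score = cand, cand_score
    return best, best_score


def synthetic_throughput(config):
    """Stand-in objective; a real run would execute a benchmark and measure throughput."""
    t = config["num_stream_threads"]
    c = config["commit_interval_ms"]
    m = config["cache_max_bytes_mb"]
    return -((t - 8) / 15) ** 2 - ((c - 400) / 990) ** 2 - ((m - 64) / 256) ** 2


def tune(score, ranges, seed=0, samples=20):
    """Phase 1: LHS exploration. Phase 2: simulated annealing. Phase 3: hill climbing."""
    rng = random.Random(seed)
    initial = latin_hypercube(ranges, samples, rng)
    start = max(initial, key=score)
    sa_best, _ = simulated_annealing(start, score, ranges, rng)
    return hill_climb(sa_best, score, ranges, rng)


if __name__ == "__main__":
    best_config, best_score = tune(synthetic_throughput, PARAM_RANGES)
    print(best_config, best_score)
```

Because each phase starts from the best configuration found by the previous one and only ever keeps improvements as its result, the final score is never worse than the best initial sample. With a real benchmark as the objective, every call to `score` costs a full experiment run, so the sample count and step budgets would need to be far smaller than the defaults used in this sketch.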
Taxonomy
Topics: Advanced Database Systems and Queries · Cloud Computing and Resource Management · Software System Performance and Reliability
