Modeling and Simulation of Spark Streaming
Jia-Chun Lin, Ming-Chang Lee, Ingrid Chieh Yu, Einar Broch Johnsen

TL;DR
This paper introduces SSP, a formal simulation model for Spark Streaming that lets users evaluate and compare configurations without deploying on an actual cluster, helping them tune performance and resource use.
Contribution
The paper presents SSP, a novel executable and configurable model in ABS for simulating Spark Streaming, aiding in parameter tuning and performance analysis.
Findings
SSP accurately mimics Spark Streaming in various scenarios.
Simulation helps identify optimal parameter configurations.
The model reduces the need for costly testing on real clusters.
Abstract
As more and more devices connect to the Internet of Things, unbounded streams of data will be generated, which must be processed "on the fly" in order to trigger automated actions and deliver real-time services. Spark Streaming is a popular real-time stream processing framework. Making efficient use of Spark Streaming and achieving stable stream processing requires a careful interplay between different parameter configurations. Mistakes may lead to significant resource overprovisioning and poor performance. To alleviate such issues, this paper develops an executable and configurable model named SSP (which stands for Spark Streaming Processing) to model and simulate Spark Streaming. SSP is written in ABS, a formal, executable, object-oriented language for modeling distributed systems by means of concurrent object groups. SSP allows users to rapidly evaluate and compare different…
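To illustrate why the interplay between parameters matters, here is a minimal toy sketch in Python. It is not SSP and not written in ABS; the function name, parameters, and cost model (a fixed per-batch overhead plus linear per-record cost, with one batch processed at a time) are all assumptions made for illustration. It shows how the scheduling delay of micro-batches stays at zero when each batch finishes within the batch interval and grows without bound when it does not.

```python
def simulate(batch_interval, arrival_rate, cores, per_core_rate,
             overhead=0.1, num_batches=50):
    """Return the scheduling delay (seconds) of each micro-batch.

    Hypothetical toy model, not the paper's SSP:
      batch_interval -- seconds between consecutive batches
      arrival_rate   -- records arriving per second
      cores          -- cores available for processing
      per_core_rate  -- records one core processes per second
      overhead       -- assumed fixed cost per batch, in seconds
    """
    delays = []
    free_at = 0.0  # time at which the single processing slot becomes free
    for i in range(1, num_batches + 1):
        ready = i * batch_interval               # batch i is ready at this time
        records = arrival_rate * batch_interval  # records collected in the batch
        processing = overhead + records / (cores * per_core_rate)
        start = max(ready, free_at)              # wait for the previous batch
        delays.append(start - ready)             # time spent queued
        free_at = start + processing
    return delays


if __name__ == "__main__":
    # Same workload, two resource configurations (values are made up).
    for cores in (1, 2):
        delays = simulate(batch_interval=1.0, arrival_rate=1000,
                          cores=cores, per_core_rate=1000)
        print(f"cores={cores}: final scheduling delay = {delays[-1]:.2f}s")
```

In this sketch, a configuration is stable when the processing time of a batch does not exceed the batch interval. Comparing a few candidate settings this way is the kind of exploration a simulation model makes cheap, although a realistic model would need to capture far more of the framework's behavior than this toy does.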
