Skitter: A Distributed Stream Processing Framework with Pluggable Distribution Strategies
Mathijs Saey (Vrije Universiteit Brussel, Belgium), Joeri De Koster (Vrije Universiteit Brussel, Belgium), Wolfgang De Meuter (Vrije Universiteit Brussel, Belgium)

TL;DR
Skitter introduces a modular programming model and domain-specific language for distributed stream processing that decouples data processing from distribution strategies, enabling flexible and efficient application development.
Contribution
The paper presents a programming model that decouples data processing from distribution, together with a DSL called Skitter in which distribution strategies can be written as modules and integrated into a stream processing framework.
Findings
Skitter enables modular implementation of distribution strategies.
The framework maintains high-level abstraction while allowing low-level customization.
Performance evaluations show competitive efficiency of Skitter-based strategies.
Abstract
Context: Distributed Stream Processing Frameworks (DSPFs) are popular tools for expressing real-time Big Data applications that have to handle enormous volumes of data in real time. These frameworks distribute their applications over a cluster in order to scale horizontally along with the amount of incoming data. Inquiry: Crucial for the performance of such applications is the **distribution strategy** that is used to partition data and computations over the cluster nodes. In some DSPFs, like Apache Spark or Flink, the distribution strategy is hardwired into the framework, which can lead to inefficient applications. At the other end of the spectrum is Apache Storm, which offers a low-level model wherein programmers can implement their own distribution strategies on a per-application basis to improve efficiency. However, this model conflates distribution and data processing…
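To make the idea of separating data processing from the distribution strategy concrete, here is a minimal single-process sketch in Python. It is not Skitter's DSL or API. Every name in it (`DistributionStrategy`, `KeyHashStrategy`, `RoundRobinStrategy`, `run_operator`) is hypothetical, and the "workers" are plain lists standing in for cluster nodes. The point is only that the processing function stays the same while the routing policy is swapped.

```python
import zlib
from typing import Any, Callable, Iterable, List, Protocol


class DistributionStrategy(Protocol):
    """Hypothetical interface: decides which worker receives an item."""

    def route(self, item: Any, n_workers: int) -> int: ...


class KeyHashStrategy:
    """Sends equal items to the same worker (useful for keyed state)."""

    def route(self, item: Any, n_workers: int) -> int:
        # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
        return zlib.crc32(repr(item).encode("utf-8")) % n_workers


class RoundRobinStrategy:
    """Spreads items evenly over workers, ignoring their content."""

    def __init__(self) -> None:
        self._next = 0

    def route(self, item: Any, n_workers: int) -> int:
        index = self._next % n_workers
        self._next += 1
        return index


def run_operator(
    items: Iterable[Any],
    process: Callable[[Any], Any],
    strategy: DistributionStrategy,
    n_workers: int,
) -> List[List[Any]]:
    """Apply `process` to each item on the worker chosen by `strategy`.

    The processing logic never inspects the strategy, and the strategy
    never inspects the processing logic.
    """
    workers: List[List[Any]] = [[] for _ in range(n_workers)]
    for item in items:
        workers[strategy.route(item, n_workers)].append(process(item))
    return workers


if __name__ == "__main__":
    double = lambda x: x * 2
    # Same processing function, two different distribution strategies.
    print(run_operator([1, 2, 3, 4], double, RoundRobinStrategy(), 2))
    print(run_operator(["a", "b", "a"], str.upper, KeyHashStrategy(), 3))
```

In this sketch, changing how work is spread over workers means passing a different strategy object, while `process` is left untouched. A real framework would also have to handle state, failures, and network transport, none of which is modelled here.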
