Parallelizing Windowed Stream Joins in a Shared-Nothing Cluster
Abhirup Chakraborty, Ajit Singh

TL;DR
This paper presents a framework for parallelizing sliding window stream joins in shared-nothing clusters, addressing scalability and overhead issues, with experimental validation demonstrating its effectiveness.
Contribution
The paper introduces a framework for parallelizing sliding window stream joins over a shared-nothing cluster, built on a fixed or predefined communication pattern and accounting for the overheads that arise when scaling to a large number of nodes.
Findings
Effective load distribution over cluster nodes
Scalability with increasing nodes
Improved join processing performance
Abstract
The availability of a large number of processing nodes in a parallel and distributed computing environment enables sophisticated real-time processing over high-speed data streams, as required by many emerging applications. Sliding window stream joins are among the most important operators in a stream processing system. In this paper, we consider the issue of parallelizing a sliding window stream join operator over a shared-nothing cluster. We propose a framework, based on a fixed or predefined communication pattern, to distribute the join processing load over the shared-nothing cluster. We consider various overheads that arise while scaling over a large number of nodes, and propose solution methodologies to cope with them. We implement the algorithm over a cluster using a message passing system, and present experimental results showing the effectiveness of the join processing algorithm.
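To make the operator in the abstract concrete, the following is a minimal single-process sketch of a time-based sliding window equi-join whose load is spread over several simulated nodes. It is not the paper's algorithm: the abstract does not detail the communication pattern, so this sketch assumes plain hash partitioning on the join key, assumes tuples arrive in nondecreasing timestamp order, and uses hypothetical names (`WindowJoinNode`, `Cluster`, streams `"R"` and `"S"`) chosen only for illustration.

```python
from collections import defaultdict, deque


class WindowJoinNode:
    """One simulated worker: a symmetric hash join over sliding windows.

    Assumption: timestamps seen by a node are nondecreasing.
    """

    def __init__(self, window):
        self.window = window  # window length, same units as timestamps
        # Per stream: an arrival-ordered queue and a key -> tuples index.
        self.queues = {"R": deque(), "S": deque()}
        self.index = {"R": defaultdict(list), "S": defaultdict(list)}

    def _expire(self, stream, now):
        """Drop tuples that fell out of the window (now - window, now]."""
        queue, index = self.queues[stream], self.index[stream]
        while queue and queue[0][0] <= now - self.window:
            _, key = queue.popleft()
            index[key].pop(0)  # oldest tuple for this key is stored first
            if not index[key]:
                del index[key]

    def insert(self, stream, ts, key, payload):
        """Probe the opposite window, then store the new tuple."""
        other = "S" if stream == "R" else "R"
        self._expire("R", ts)
        self._expire("S", ts)
        partners = self.index[other].get(key, [])
        if stream == "R":
            results = [(payload, p) for _, p in partners]
        else:
            results = [(p, payload) for _, p in partners]
        self.queues[stream].append((ts, key))
        self.index[stream][key].append((ts, payload))
        return results


class Cluster:
    """Routes each tuple to one node by hashing its join key (an assumption)."""

    def __init__(self, num_nodes, window):
        self.nodes = [WindowJoinNode(window) for _ in range(num_nodes)]

    def route(self, key):
        return hash(key) % len(self.nodes)

    def insert(self, stream, ts, key, payload):
        return self.nodes[self.route(key)].insert(stream, ts, key, payload)


cluster = Cluster(num_nodes=4, window=10)
out1 = cluster.insert("R", 1, "a", "r1")
out2 = cluster.insert("S", 2, "a", "s1")
out3 = cluster.insert("R", 11, "a", "r2")  # r1 has expired, s1 has not
```

Because matching tuples share a key, they always land on the same node, so no node needs to see another node's window. The trade-off is that a skewed key distribution overloads individual nodes, which is one kind of overhead a real framework would have to address.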
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · Advanced Database Systems and Queries
