Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine
Kasper Grud Skat Madsen, Yongluan Zhou, Jianneng Cao

TL;DR
This paper presents an integrated optimization approach for load balancing, operator placement, and scaling in parallel stream processing engines, improving performance and resource utilization.
Contribution
It models the coupled problems as a single optimization problem and introduces ALBIC, an extended solution supporting general jobs, implemented on Apache Storm.
Findings
Outperforms existing approaches in experiments
Effectively balances load and reduces communication costs
Supports a wide range of job types
Abstract
Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost respectively. In previous work, these issues are typically tackled separately and independently. We argue that these problems are tightly coupled in the sense that they all need to determine the allocations of workloads and migrate computational states at runtime. Optimizing them independently would result in suboptimal solutions. Therefore, in this paper, we investigate how these three issues can be modeled as one integrated optimization problem. In particular, we first consider jobs where workload allocations have little effect on the communication cost, and model the problem of load balance as a Mixed-Integer Linear Program. Afterwards, we present an…
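As an illustration of what "load balance as a Mixed-Integer Linear Program" can look like, here is a generic min-max assignment formulation. It is a textbook sketch, not the paper's model, which the truncated abstract does not show. All symbols are assumptions introduced for this example: $I$ is a set of workload units, $J$ a set of nodes, $\ell_i$ the measured load of unit $i$, $x_{ij}$ a binary variable that is 1 when unit $i$ is placed on node $j$, and $z$ a continuous upper bound on the load of any node.

```latex
% Illustrative only: a generic min-max load-balancing MILP.
% The symbols I, J, \ell_i, x_{ij}, z are assumptions, not the paper's notation.
\begin{align}
  \min_{x,\,z} \quad & z \\
  \text{s.t.} \quad
    & \sum_{j \in J} x_{ij} = 1            && \forall i \in I
      && \text{(each unit is placed on exactly one node)} \\
    & \sum_{i \in I} \ell_i \, x_{ij} \le z && \forall j \in J
      && \text{(no node's load exceeds } z\text{)} \\
    & x_{ij} \in \{0, 1\}, \quad z \ge 0    && \forall i \in I,\ j \in J
\end{align}
```

The program is mixed-integer because the assignment variables $x_{ij}$ are binary while $z$ is continuous. Minimizing $z$ pushes down the load of the most heavily loaded node, which is one common way to express balance.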
Taxonomy
Topics: Cloud Computing and Resource Management · Graph Theory and Algorithms · Distributed and Parallel Computing Systems
