Optimal Operator State Migration for Elastic Data Stream Processing
Jianbing Ding, Tom Z. J. Fu, Richard T. B. Ma, Marianne Winslett, Yin Yang, Zhenjie Zhang, Hongyang Chao

TL;DR
This paper introduces an optimized approach for migrating operator states in elastic data stream processing systems, reducing delays and costs during dynamic node changes.
Contribution
It presents the first comprehensive study and algorithms for live, progressive, and optimized operator state migration in elastic DSMS.
Findings
Significant reduction in migration-induced delays
Lower synchronization overhead during migration
Efficient task assignment minimizes migration costs
Abstract
A cloud-based data stream management system (DSMS) handles fast data by utilizing the massively parallel processing capabilities of the underlying platform. An important property of such a DSMS is elasticity, meaning that nodes can be dynamically added to or removed from an application to match the latter's workload, which may fluctuate in an unpredictable manner. For an application involving stateful operations such as aggregates, the addition / removal of nodes necessitates the migration of operator states. Although the importance of migration has been recognized in existing systems, two key problems remain largely neglected, namely how to migrate and what to migrate, i.e., the migration mechanism that reduces synchronization overhead and result delay during migration, and the selection of the optimal task assignment that minimizes migration costs. Consequently, migration in current…
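To make the "what to migrate" question concrete, here is a minimal sketch of how the cost of a candidate task assignment could be measured. It is an illustration under stated assumptions, not the paper's algorithm: the function `migration_cost`, the task and node names, and the state sizes are all hypothetical, and cost is simply taken as the total state size of tasks whose hosting node changes.

```python
from typing import Dict


def migration_cost(state_size: Dict[str, int],
                   old: Dict[str, str],
                   new: Dict[str, str]) -> int:
    """Sum the state sizes of all tasks whose node differs between assignments.

    Assumption: moving a task costs exactly its state size (arbitrary units).
    """
    return sum(size for task, size in state_size.items()
               if old[task] != new[task])


# Hypothetical setup: four stateful tasks on two nodes, scaling out to three.
state_size = {"t1": 40, "t2": 10, "t3": 30, "t4": 20}
old = {"t1": "n1", "t2": "n1", "t3": "n2", "t4": "n2"}

# Candidate A reshuffles most tasks; candidate B moves only small states to n3.
# Both leave per-node loads of 40, 30, and 30 units.
cand_a = {"t1": "n2", "t2": "n1", "t3": "n3", "t4": "n1"}
cand_b = {"t1": "n1", "t2": "n3", "t3": "n2", "t4": "n3"}

cost_a = migration_cost(state_size, old, cand_a)
cost_b = migration_cost(state_size, old, cand_b)
print(cost_a, cost_b)
```

Both candidates balance the load equally well, yet they differ in how much state has to move, which is why the choice of assignment matters independently of the migration mechanism.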
Taxonomy
Topics: Advanced Database Systems and Queries · Data Stream Mining Techniques · Cloud Computing and Resource Management
