Fault Tolerance for Stream Processing Engines
Muhammad Anis Uddin Nasir

TL;DR
This paper surveys fault tolerance techniques in distributed stream processing engines, analyzing their approaches and implications for applications demanding low latency, high throughput, and high availability.
Contribution
It provides a comprehensive categorization and comparison of fault tolerance methods used in modern DSPEs like Storm, SparkStreaming, and MillWheel.
Findings
Active replication, passive replication, and upstream backup are key fault tolerance approaches.
Different techniques have varying impacts on latency, throughput, and scalability.
Fault tolerance strategies must be tailored to specific streaming application requirements.
Abstract
Distributed Stream Processing Engines (DSPEs) target applications related to continuous computation, online machine learning and real-time query processing. DSPEs operate on high volume of data by applying lightweight operations on real-time and continuous streams. Such systems require clusters of hundreds of machine for their deployment. Streaming applications come with various requirements, i.e., low-latency, high throughput, scalability and high availability. In this survey, we study the fault tolerance problem for DSPEs. We discuss fault tolerance techniques that are used in modern stream processing engines that are Storm, S4, Samza, SparkStreaming and MillWheel. Further, we give insight on fault tolerance approaches that we categorize as active replication, passive replication and upstream backup. Finally, we discuss implications of the fault tolerance techniques for different…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsCloud Computing and Resource Management · Distributed systems and fault tolerance · IoT and Edge/Fog Computing
