Delivery, consistency, and determinism: rethinking guarantees in distributed stream processing
Artem Trofimov, Igor E. Kuralenok, Nikita Marshalkin, Boris Novikov

TL;DR
This paper introduces a formal framework for stream processing guarantees, showing that lightweight determinism can enable exactly-once delivery with minimal performance overhead, outperforming existing solutions.
Contribution
It provides a formal definition of streaming guarantees and demonstrates how lightweight determinism can achieve exactly-once delivery efficiently.
Findings
Lightweight determinism enables nearly overhead-free exactly-once delivery.
The proposed approach significantly outperforms existing industrial solutions.
Properties of delivery, consistency, and determinism are closely interconnected.
Abstract
Consistency requirements for state-of-the-art stream processing systems are defined in terms of delivery guarantees. Exactly-once is the strongest guarantee and the most desirable for end users. However, this concept has several issues. Commonly used techniques that enforce exactly-once incur significant performance overhead. Moreover, the notion of exactly-once is not formally defined and does not capture all the properties that stream processing systems supporting this guarantee provide. In this paper, we introduce a formal framework that allows us to define streaming guarantees more regularly. We demonstrate that the properties of delivery, consistency, and determinism are tightly connected within distributed stream processing. We also show that, given lightweight determinism, it is possible to provide exactly-once with almost no performance overhead. Experiments show that the…
Taxonomy
Topics: Distributed systems and fault tolerance · Advanced Database Systems and Queries · Cloud Computing and Resource Management
