A Cloud Native Platform for Stateful Streaming
Scott Schneider, Xavier Guerin, Shaohan Hu, Kun-Lung Wu

TL;DR
This paper introduces a cloud native version of IBM Streams that uses Kubernetes as its platform. It relies on custom resources and four design patterns (controllers, conductors, coordinators and causal chains) to improve manageability and reduce platform code, with performance comparable to traditional systems.
Contribution
The paper presents an architecture that replaces the original Streams platform with Kubernetes. This gives Streams native integration with Kubernetes and a smaller platform code base, and the authors evaluate the design through experiments.
Findings
Kubernetes can effectively replace traditional platform management.
The new system reduces platform code by 75%.
Performance is adequate in most scenarios, with some issues.
Abstract
We present the architecture of a cloud native version of IBM Streams, with Kubernetes as our target platform. Streams is a general purpose streaming system with its own platform for managing applications and the compute clusters that execute those applications. Cloud native Streams replaces that platform with Kubernetes. By using Kubernetes as its platform, Streams is able to offload job management, life cycle tracking, address translation, fault tolerance and scheduling. This offloading is possible because we define custom resources that natively integrate into Kubernetes, allowing Streams to use Kubernetes' eventing system as its own. We use four design patterns to implement our system: controllers, conductors, coordinators and causal chains. Composing controllers, conductors and coordinators allows us to build deterministic state machines out of an asynchronous distributed system.…
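The controller idea mentioned in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation and it does not use any Kubernetes client library: the `Job` resource, the `JobController` class, the event names (`JobAdded`, `PodReady`) and the state names are all hypothetical, chosen only to show how a state derived from observed events, rather than from their arrival order, yields deterministic behavior on top of asynchronous notifications.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical custom resource: a job that needs `expected_pods` pods."""
    name: str
    expected_pods: int
    ready_pods: set = field(default_factory=set)
    state: str = "Submitting"


class JobController:
    """Reacts to events for one resource kind and moves it toward its goal state."""

    def __init__(self):
        self.jobs = {}

    def on_event(self, kind, job_name, payload=None):
        if kind == "JobAdded":
            self.jobs[job_name] = Job(job_name, expected_pods=payload)
        elif kind == "PodReady":
            # A set makes duplicate or replayed events harmless.
            self.jobs[job_name].ready_pods.add(payload)
        self._reconcile(self.jobs[job_name])

    def _reconcile(self, job):
        # The state depends only on what has been observed so far,
        # not on the order in which the events arrived.
        if len(job.ready_pods) == job.expected_pods:
            job.state = "Running"
        elif job.ready_pods:
            job.state = "Starting"


def run(pod_events):
    """Feed pod events (in the given order) to a fresh controller."""
    controller = JobController()
    controller.on_event("JobAdded", "wordcount", 3)
    for pod in pod_events:
        controller.on_event("PodReady", "wordcount", pod)
    return controller.jobs["wordcount"].state
```

Calling `run` with the same three pod names in any order ends in the same state, and a repeated event does not advance the job. In a real deployment the events would come from the Kubernetes API server's watch mechanism rather than from direct method calls.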
Taxonomy
Topics: Cloud Computing and Resource Management · Distributed systems and fault tolerance · Distributed and Parallel Computing Systems
