Nephele Streaming: Stream Processing Under QoS Constraints At Scale
Björn Lohrmann, Daniel Warneke, Odej Kao

TL;DR
This paper introduces Nephele Streaming, a scalable stream processing approach that ensures QoS constraints are met, significantly reducing latency while maintaining throughput in large-scale data processing environments.
Contribution
It presents a novel distributed scheme for detecting QoS violations and optimizing job execution in parallel data processing frameworks, demonstrated with Nephele.
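As a rough illustration of what detecting a QoS violation can mean for a latency constraint, the minimal sketch below keeps a sliding window of measured latencies and flags a violation when their mean exceeds an upper bound. It is not the scheme from the paper: the class name `LatencyConstraint`, its parameters, and the windowed-mean rule are all assumptions made for this example.

```python
from collections import deque
from statistics import mean


class LatencyConstraint:
    """Hypothetical constraint: the mean latency over a sliding
    window of recent samples must not exceed an upper bound."""

    def __init__(self, max_latency_ms, window_size):
        self.max_latency_ms = max_latency_ms
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window_size)

    def record(self, latency_ms):
        """Store one measured latency in milliseconds."""
        self.samples.append(latency_ms)

    def is_violated(self):
        """Return True if the windowed mean exceeds the bound."""
        if not self.samples:
            return False  # no measurements yet, nothing to report
        return mean(self.samples) > self.max_latency_ms


constraint = LatencyConstraint(max_latency_ms=100.0, window_size=3)
for latency in (50.0, 80.0, 90.0, 200.0):
    constraint.record(latency)
print(constraint.is_violated())
```

In a distributed setting, a check of this kind would presumably run close to where the measurements are taken, so that only violations rather than raw samples need to be reported; that placement is likewise an assumption here.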
Findings
Processing latency reduced by a factor of at least 13
High throughput preserved under QoS constraints
Effective detection and correction of QoS violations
Abstract
The ability to process large numbers of continuous data streams in a near-real-time fashion has become a crucial prerequisite for many scientific and industrial use cases in recent years. While the individual data streams are usually trivial to process, their aggregated data volumes easily exceed the scalability of traditional stream processing systems. At the same time, massively parallel data processing systems like MapReduce or Dryad currently enjoy tremendous popularity for data-intensive applications and have proven to scale to large numbers of nodes. Many of these systems also provide streaming capabilities. However, unlike traditional stream processors, these systems have disregarded QoS requirements of prospective stream processing applications so far. In this paper we address this gap. First, we analyze common design principles of today's parallel data processing frameworks…
