Scaling Ordered Stream Processing on Shared-Memory Multicores
Guna Prasaad, G. Ramalingam, Kaushik Rajan

TL;DR
This paper presents new methods for parallelizing ordered stream processing on shared-memory multicore systems, balancing low latency, even load distribution, and order preservation.
Contribution
It introduces a non-blocking concurrent data structure, a load-aware parallelization approach, and an adaptive runtime with heuristics for optimizing ordered streaming computations.
Findings
Scheduling heuristics that exploit pipeline parallelism outperform those that rely on data parallelism.
The proposed data structures reduce latency in ordered output processing.
Adaptive runtime improves resource utilization and throughput.
Abstract
Many modern applications require real-time processing of large volumes of high-speed data. Such data processing needs can be modeled as a streaming computation. A streaming computation is specified as a dataflow graph that exposes multiple opportunities for parallelizing its execution, in the form of data, pipeline and task parallelism. On the other hand, many important applications require that processing of the stream be ordered, where inputs are processed in the same order as they arrive. There is a fundamental conflict between ordered processing and parallelizing the streaming computation. This paper focuses on the problem of effectively parallelizing ordered streaming computations on a shared-memory multicore machine. We first address the key challenges in exploiting data parallelism in the ordered setting. We present a low-latency, non-blocking concurrent data structure to order…
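To make the conflict between ordered processing and data parallelism concrete, here is a minimal Python sketch. It assumes each input carries a sequence number, and it uses a simple lock-based reorder buffer. It is not the paper's non-blocking data structure, and the names `ReorderBuffer` and `run_ordered` are hypothetical, chosen only for illustration.

```python
import heapq
import threading
from concurrent.futures import ThreadPoolExecutor


class ReorderBuffer:
    """Releases results in sequence-number order, whatever order they finish in.

    Illustrative only: this version takes a lock on every put, so it is
    blocking, unlike the non-blocking structure the abstract mentions.
    """

    def __init__(self, emit):
        self._emit = emit        # downstream callback, always called in order
        self._next_seq = 0       # next sequence number allowed to leave
        self._pending = []       # min-heap of (seq, result) that arrived early
        self._lock = threading.Lock()

    def put(self, seq, result):
        with self._lock:
            heapq.heappush(self._pending, (seq, result))
            # Drain every result that is now contiguous with what was emitted.
            while self._pending and self._pending[0][0] == self._next_seq:
                _, ready = heapq.heappop(self._pending)
                self._emit(ready)
                self._next_seq += 1


def run_ordered(inputs, operator, workers=4):
    """Apply `operator` to inputs in parallel; return outputs in input order."""
    out = []
    buf = ReorderBuffer(out.append)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for seq, item in enumerate(inputs):
            # Default arguments bind the current seq and item to this task.
            pool.submit(lambda s=seq, x=item: buf.put(s, operator(x)))
    # Leaving the `with` block waits for all submitted tasks to finish.
    return out
```

In this sketch a result that finishes early waits in the heap until every earlier sequence number has been emitted, which is where ordering adds latency: one slow input holds back all outputs behind it.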
