Efficient data streaming multiway aggregation through concurrent algorithmic designs and new abstract data types

Vincenzo Gulisano; Yiannis Nikolakopoulos; Daniel Cederman; Marina Papatriantafilou; Philippas Tsigas

arXiv:1606.04746·cs.DS·June 16, 2016

TL;DR

This paper introduces new lock-free data structures and algorithms for efficient multiway aggregation in data streaming systems, improving throughput and latency by up to a factor of ten.

Contribution

It presents novel abstract data types and lock-free algorithms tailored for high-performance multiway aggregation in data streams, enabling better parallelism and efficiency.

Findings

01  Up to tenfold improvement in throughput and latency

02  Effective support for both order-sensitive and order-insensitive aggregates

03  Validated on large datasets from SoundCloud and Smart Grid networks

Abstract

Data streaming relies on continuous queries to process unbounded streams of data in real time. It is typically computationally demanding, given that the relevant applications involve very large volumes of data. Data structures act as articulation points and maintain the state of data streaming operators, potentially supporting high parallelism and balancing the work between them. Prompted by this fact, in this work we study and analyze the parallelization needs of these articulation points, focusing on the problem of streaming multiway aggregation, where large data volumes are received from multiple input streams. The analysis of these parallelization needs, as well as of the use and limitations of existing aggregate designs and their data structures, leads us to identify the need for proper shared objects that can achieve low-latency, high-throughput multiway aggregation. We…
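
To make the problem statement concrete, the following is a minimal single-threaded sketch of multiway aggregation, not the paper's concurrent data structures. It assumes each input stream delivers `(timestamp, key, value)` tuples already sorted by timestamp, and it sums values per key over tumbling windows. The function name `multiway_window_sum`, the tuple layout, and the choice of a sum aggregate are all hypothetical and chosen only for illustration.

```python
import heapq


def multiway_window_sum(streams, window_size):
    """Merge timestamp-sorted streams and sum values per key in tumbling windows.

    Hypothetical illustration: each stream is an iterable of
    (timestamp, key, value) tuples sorted by timestamp. Yields
    (window_start, {key: sum}) once a window can no longer receive tuples.
    """
    # Deterministic merge: tuples leave the heap in global timestamp order.
    merged = heapq.merge(*streams, key=lambda t: t[0])
    current_start = None
    sums = {}
    for ts, key, value in merged:
        start = (ts // window_size) * window_size
        # A tuple from a later window means the current window is complete.
        if current_start is not None and start != current_start:
            yield current_start, dict(sums)
            sums.clear()
        current_start = start
        sums[key] = sums.get(key, 0) + value
    # Flush the last open window when the (finite, test-sized) inputs end.
    if current_start is not None:
        yield current_start, dict(sums)


# Two input streams, windows of 10 time units.
stream_a = [(1, "x", 10), (4, "y", 1), (12, "x", 5)]
stream_b = [(2, "x", 3), (11, "y", 7)]
results = list(multiway_window_sum([stream_a, stream_b], 10))
# results == [(0, {"x": 13, "y": 1}), (10, {"y": 7, "x": 5})]
```

In this sketch the merge and the window state are touched by one thread only. The abstract's point is that in a parallel setting exactly this shared state becomes the articulation point that several threads access concurrently, which is where the choice of shared data structure matters.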
