AIR: A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing

Vinu E. Venugopal; Martin Theobald; Samira Chaychi; Amal Tawakuli

arXiv:2001.00164 · cs.DC · January 6, 2020

TL;DR

AIR is a high-performance, lightweight dataflow engine built with C++ and MPI that achieves significantly lower latency and higher throughput than Spark and Flink by using asynchronous communication and avoiding master node bottlenecks.

Contribution

The paper introduces AIR, a novel dataflow engine designed from scratch with MPI and pthreads, implementing asynchronous routing to improve performance and scalability over existing systems.

Findings

01

AIR outperforms Spark and Flink in both latency and throughput by a factor of up to 15.

02

AIR scales out more effectively than Spark and Flink on clusters of up to 8 nodes and 224 cores.

03

The architecture reduces control flow overhead by eliminating the master node.
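
To make the masterless idea concrete, here is a minimal C++ sketch of how workers can agree on message destinations without consulting a coordinator. It is an illustration only, not AIR's actual code: the `Message` type, the `route` and `shuffle` functions, and the choice of hash partitioning with FNV-1a are all assumptions introduced here, and the MPI transport is replaced by in-process vectors so the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical message type: a keyed record flowing through the dataflow.
struct Message {
    std::string key;
    long value;
};

// FNV-1a hash. A fixed, deterministic hash is used so that every rank
// computes the same destination for the same key (std::hash is not
// guaranteed to be stable across processes or implementations).
inline std::uint64_t fnv1a(const std::string& s) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Each worker evaluates this locally; no master node is asked where a
// message should go. (Assumed scheme: plain hash partitioning.)
inline int route(const std::string& key, int world_size) {
    assert(world_size > 0);
    return static_cast<int>(fnv1a(key) % static_cast<std::uint64_t>(world_size));
}

// Simulates one shuffle step. local_batches[r] holds the messages produced
// on rank r; the result's element d holds everything routed to rank d.
// In an MPI setting each per-destination group would be handed to a
// nonblocking send instead of being appended in-process.
inline std::vector<std::vector<Message>> shuffle(
        const std::vector<std::vector<Message>>& local_batches) {
    const int world_size = static_cast<int>(local_batches.size());
    std::vector<std::vector<Message>> inboxes(local_batches.size());
    for (const auto& batch : local_batches) {
        for (const auto& m : batch) {
            inboxes[static_cast<std::size_t>(route(m.key, world_size))].push_back(m);
        }
    }
    return inboxes;
}
```

Calling `shuffle` on one batch per rank returns one inbox per rank, and all messages sharing a key end up in the same inbox regardless of which rank produced them. Because the routing decision depends only on the key and the number of ranks, no central component sits on the data path in this sketch.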

Abstract

Distributed Stream Processing Systems (DSPSs) are currently among the most rapidly emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. The major market players in this domain are clearly represented by Apache Spark and Flink, which provide a variety of frontend APIs for SQL, statistical inference, machine learning, stream processing, and many others. Yet rather few details are reported on the integration of these engines into the underlying High-Performance Computing (HPC) infrastructure and the communication protocols they use. Spark and Flink, for example, are implemented in Java and still rely on a dedicated master node for managing their control flow among the worker nodes in a compute cluster. In this paper, we describe the architecture of our AIR engine, which is designed from…
