# Quantitative Impact Evaluation of an Abstraction Layer for Data Stream Processing Systems

**Authors:** Guenter Hesse, Christoph Matthies, Kelvin Glass, Johannes Huegle, Matthias Uflacker

arXiv: 1907.08302 · 2019-07-22

## TL;DR

This paper evaluates the performance impact of using Apache Beam as an abstraction layer across multiple data stream processing frameworks, revealing significant slowdowns and variability in execution times.

## Contribution

The paper introduces a novel benchmark architecture for comparing the performance impact of using Apache Beam on three streaming frameworks: Apache Spark Streaming, Apache Flink, and Apache Apex.

## Key findings

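- Using Apache Beam caused a high variance of query execution times in the examined streaming applications.
- Queries run through Apache Beam were slowed down by up to a factor of 58 compared to queries developed without the abstraction layer.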
- Benchmark artifacts are publicly available for reproducibility.
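
To make the two metrics in the findings above concrete, here is a minimal sketch of how a slowdown factor and the run-to-run variability of execution times could be computed. It assumes the slowdown factor is the ratio of mean execution times (Beam over native) and uses the coefficient of variation as the variability measure; the paper's exact definitions may differ. The function names and all timings are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def slowdown_factor(beam_times, native_times):
    """Ratio of mean execution time with the abstraction layer to mean native time.

    Assumed definition for illustration; the paper may define it differently.
    """
    return mean(beam_times) / mean(native_times)


def coefficient_of_variation(times):
    """Sample standard deviation relative to the mean (unitless variability)."""
    return stdev(times) / mean(times)


# Hypothetical execution times in seconds for one query; NOT data from the paper.
native_runs = [10.0, 11.0, 9.0]
beam_runs = [40.0, 70.0, 100.0]

factor = slowdown_factor(beam_runs, native_runs)
cv_native = coefficient_of_variation(native_runs)
cv_beam = coefficient_of_variation(beam_runs)

print(f"slowdown factor: {factor:.1f}")
print(f"variability (native): {cv_native:.2f}")
print(f"variability (Beam):   {cv_beam:.2f}")
```

A relative measure such as the coefficient of variation is used here so that the variability of slow and fast configurations can be compared on the same scale.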

## Abstract

With the demand to process ever-growing data volumes, a variety of new data stream processing frameworks have been developed. Moving an implementation from one such system to another, e.g., for performance reasons, requires adapting existing applications to new interfaces. Apache Beam addresses these high substitution costs by providing an abstraction layer that enables executing programs on any of the supported streaming frameworks. In this paper, we present a novel benchmark architecture for comparing the performance impact of using Apache Beam on three streaming frameworks: Apache Spark Streaming, Apache Flink, and Apache Apex. We find significant performance penalties when using Apache Beam for application development in the surveyed systems. Overall, usage of Apache Beam for the examined streaming applications caused a high variance of query execution times with a slowdown of up to a factor of 58 compared to queries developed without the abstraction layer. All developed benchmark artifacts are publicly available to ensure reproducible results.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.08302/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1907.08302/full.md

## References

68 references — full list in the complete paper: https://tomesphere.com/paper/1907.08302/full.md

---
Source: https://tomesphere.com/paper/1907.08302