Compile-Time Java Stream Fusion via mapMulti
Yegor Bugayenko, Maxim Trunnikov, Vladimir Zakharov

TL;DR
This paper presents an open-source Java optimizer that merges consecutive stream operations into a single mapMulti() call, reducing overhead and improving performance.
Contribution
It introduces a novel compile-time optimization for Java streams that overcomes limitations of previous tools by merging operations into mapMulti(), enhancing efficiency.
Findings
Achieved better performance in two benchmarks
Maintained comparable performance in most benchmarks
Successfully optimized Apache Kafka bytecode with all tests passing
Abstract
The Java Stream API, introduced in Java 8, makes data processing more expressive and concise compared to imperative loops. However, this abstraction can come with significant performance overhead, often due to the creation of multiple intermediate objects during pipeline execution. In functional languages such as Haskell, this problem is addressed through stream fusion, a compile-time optimization that eliminates unnecessary intermediate structures. Inspired by this idea, Streamliner was the first tool to perform ahead-of-time, bytecode-to-bytecode stream optimization for Java by unrolling stream pipelines into imperative loops. In this paper, we introduce an open-source optimizer that takes a different approach. Instead of unrolling streams into loops, it merges consecutive map() and filter() operations into a single mapMulti() call, available since Java 16. Our method avoids several…
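To make the merging idea concrete, here is a minimal source-level sketch of a map()/filter()/map() chain next to a hand-written single mapMulti() stage that produces the same result. It is an illustration only: the optimizer described above works on bytecode, and the class name FusionSketch, the method names, and the lambdas are hypothetical and not taken from the paper.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FusionSketch {
    // Original-style pipeline: three separate intermediate stages.
    public static List<Integer> unfused(List<Integer> input) {
        return input.stream()
            .map(x -> x * 2)
            .filter(x -> x > 4)
            .map(x -> x + 1)
            .collect(Collectors.toList());
    }

    // Hand-fused equivalent: the same logic in one mapMulti() stage (Java 16+).
    // An element is passed downstream only if it survives the filter condition.
    public static List<Integer> fused(List<Integer> input) {
        return input.stream()
            .<Integer>mapMulti((x, sink) -> {
                int doubled = x * 2;          // first map()
                if (doubled > 4) {            // filter()
                    sink.accept(doubled + 1); // second map()
                }
            })
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4);
        System.out.println(unfused(data)); // [7, 9]
        System.out.println(fused(data));   // [7, 9]
    }
}
```

Both methods return the same list for the same input. The fused version builds one intermediate stage instead of three, which is the kind of overhead reduction the abstract refers to.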
