Streaming Technologies and Serialization Protocols: Empirical Performance Analysis
Samuel Jackson, Nathan Cummings, Saiful Khan

TL;DR
This paper empirically evaluates various streaming technologies and serialization protocols, revealing significant performance differences and providing insights to optimize real-time data processing in scientific and industrial applications.
Contribution
It introduces an open-source benchmarking framework and offers comprehensive performance analysis of streaming solutions for high-volume data scenarios.
Findings
Significant performance variations among streaming technologies
Trade-offs identified between latency, throughput, and resource usage
Guidelines for selecting appropriate streaming protocols for specific tasks
Abstract
Efficient data streaming is essential for real-time data analytics, visualization, and machine learning model training, particularly when dealing with high-volume datasets. Various streaming technologies and serialization protocols have been developed to cater to different streaming requirements, and each performs differently depending on the task and dataset involved. This variety makes it difficult to select the most appropriate combination, as encountered during the implementation of streaming systems for the MAST fusion device data and SKA's radio astronomy data. To address this challenge, we conducted an empirical study of widely used data streaming technologies and serialization protocols. We also developed an extensible, open-source software framework to benchmark their efficiency across various performance metrics. Our study uncovers significant performance differences and…
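To make the kind of measurement described in the abstract concrete, the following is a minimal sketch of a serialization micro-benchmark. It is not the authors' framework: the `benchmark` helper, the payload, and the choice of the standard-library `json` and `pickle` formats are all assumptions made for illustration, and it measures only in-process round-trip time and encoded size, not network streaming.

```python
import json
import pickle
import time


def benchmark(name, encode, decode, payload, repeats=200):
    """Time repeated encode/decode round trips and report the encoded size.

    Hypothetical helper for illustration; not part of the paper's framework.
    """
    encoded = encode(payload)
    start = time.perf_counter()
    for _ in range(repeats):
        decoded = decode(encode(payload))
    elapsed = time.perf_counter() - start
    assert decoded == payload  # the round trip must be lossless
    return {
        "name": name,
        "size_bytes": len(encoded),
        "mean_round_trip_s": elapsed / repeats,
        "throughput_bytes_per_s": len(encoded) * repeats / elapsed,
    }


# Hypothetical payload: a small batch of sensor-like samples.
payload = {"shot": 1, "channel": "demo", "samples": [i * 0.5 for i in range(1000)]}

results = [
    benchmark(
        "json",
        lambda obj: json.dumps(obj).encode("utf-8"),
        lambda raw: json.loads(raw.decode("utf-8")),
        payload,
    ),
    benchmark("pickle", pickle.dumps, pickle.loads, payload),
]

for row in results:
    print(row)
```

The absolute numbers depend on the machine and payload shape, so a sketch like this is only useful for comparing formats under identical conditions; a fuller benchmark would also vary payload size and include the transport layer.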
Taxonomy
Topics: Peer-to-Peer Network Technologies · Caching and Content Delivery · Advanced Data Storage Technologies
