Railgun: managing large streaming windows under MAD requirements
Ana Sofia Gomes, João Oliveirinha, Pedro Cardoso, Pedro Bizarro

TL;DR
Railgun is a distributed streaming system designed to efficiently handle large, long-duration sliding windows with high throughput and low latency, suitable for mission-critical applications like fraud detection.
Contribution
It introduces a fault-tolerant, elastic, distributed streaming system that computes accurate real-time sliding windows under high load and millisecond-level latency requirements, with lower latency than Flink in the authors' benchmarks.
Findings
Lower latency than Flink in benchmarks
Low memory usage independent of window size
Scales nearly linearly under high load
Abstract
Some mission-critical systems, e.g., fraud detection, require accurate, real-time metrics over long sliding time windows in applications that demand high throughput and low latency. Because these applications must run 'forever' and cope with large, spiky data loads, they must also run in a distributed setting. We are unaware of any streaming system that provides all of these properties. Instead, existing systems make large simplifications, such as implementing sliding windows as a fixed set of overlapping windows, jeopardizing metric accuracy (violating regulatory rules) or latency (breaching service agreements). In this paper, we propose Railgun, a fault-tolerant, elastic, and distributed streaming system supporting real-time sliding windows for scenarios requiring high loads and millisecond-level latencies. We benchmarked an initial prototype of Railgun using real data,…
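The accuracy concern raised in the abstract can be illustrated with a small sketch. This is not Railgun's implementation; it only contrasts a per-event sliding window with the "fixed set of overlapping windows" approximation. The names `SlidingCounter` and `hopping_count`, the 60-second window, and the 5-second hop are all assumptions chosen for illustration.

```python
from collections import deque


class SlidingCounter:
    """Exact count of events in the half-open interval (t - window, t].

    Hypothetical helper: re-evaluated on every event, so the result
    always reflects the window ending at the latest event.
    """

    def __init__(self, window):
        self.window = window
        self.events = deque()  # timestamps in arrival order

    def add(self, t):
        self.events.append(t)
        # Evict events that have fallen out of the window ending at t.
        while self.events and self.events[0] <= t - self.window:
            self.events.popleft()
        return len(self.events)


def hopping_count(timestamps, now, window, hop):
    """Approximation: use the latest window that closed on a hop boundary."""
    end = (now // hop) * hop
    return sum(1 for t in timestamps if end - window < t <= end)


# Timestamps in seconds: two old events, then a burst just after t = 60.
events = [3, 10, 61, 62, 63, 64]

counter = SlidingCounter(window=60)
exact = [counter.add(t) for t in events]
approx = hopping_count(events, now=64, window=60, hop=5)

print(exact)   # [1, 2, 3, 4, 4, 5]
print(approx)  # 2
```

At t = 64 the per-event window sees five events, while the most recently closed hopping window (ending at t = 60) sees only two, so a rule such as "more than four transactions in 60 seconds" would fire late under the approximation. Shrinking the hop narrows the gap but multiplies the number of windows to maintain.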
Taxonomy
Topics: Embedded Systems and FPGA Design · Advanced Data Storage Technologies
