Distributed Real-Time Data Stream Analysis for CTA
Kai Brügge, Alexey Egorov, Christian Bockermann, Katharina Morik, Wolfgang Rhode

TL;DR
This paper explores distributed streaming technologies like Spark, Flink, and Storm to enable real-time analysis of CTA gamma-ray telescope data, addressing bottlenecks in processing large data streams from multiple telescopes.
Contribution
It presents a comparative investigation of distributed streaming platforms and a prototype system for real-time CTA data analysis using abstraction layers for platform independence.
Findings
Distributed streaming platforms can handle CTA's high data throughput.
Abstraction layers enable code portability across different streaming engines.
Prototype demonstrates real-time analysis capability for CTA data streams.
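The idea of an abstraction layer can be illustrated with a minimal sketch. This is not the authors' prototype and uses no Spark, Flink, or Storm API. It only shows the general pattern of defining the analysis chain once and handing it to interchangeable runners. All names here (`calibrate`, `tag_bright`, `LocalRunner`, `ThreadedRunner`), the event fields, and the numeric constants are hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# An event is a plain dictionary; a processor maps one event to a new event.
Event = Dict[str, float]
Processor = Callable[[Event], Event]


def calibrate(event: Event) -> Event:
    # Hypothetical step: scale a raw value by a made-up gain factor.
    return {**event, "calibrated": event["raw"] * 0.5}


def tag_bright(event: Event) -> Event:
    # Hypothetical step: flag events above a made-up threshold.
    return {**event, "bright": float(event["calibrated"] > 10.0)}


def apply_chain(chain: List[Processor], event: Event) -> Event:
    # Apply each processor in order to a single event.
    for step in chain:
        event = step(event)
    return event


class LocalRunner:
    """Hypothetical backend: runs the chain sequentially in one thread."""

    def run(self, chain: List[Processor], events: List[Event]) -> List[Event]:
        return [apply_chain(chain, e) for e in events]


class ThreadedRunner:
    """Hypothetical backend: runs the same chain on a thread pool."""

    def __init__(self, workers: int = 4) -> None:
        self.workers = workers

    def run(self, chain: List[Processor], events: List[Event]) -> List[Event]:
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # pool.map returns results in input order.
            return list(pool.map(lambda e: apply_chain(chain, e), events))


# The analysis chain is written once and is unaware of the backend.
chain = [calibrate, tag_bright]
events = [{"raw": 4.0}, {"raw": 30.0}, {"raw": 22.0}]

local_result = LocalRunner().run(chain, events)
threaded_result = ThreadedRunner(workers=2).run(chain, events)
```

Both runners accept the same chain unchanged, which is the property the abstraction layer is meant to provide. A real deployment would replace the runners with adapters for an actual streaming engine.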
Abstract
Once completed, the Cherenkov Telescope Array (CTA) will be able to map the gamma-ray sky in a wide energy range from several tens of GeV to some hundreds of TeV and will be more sensitive than previous experiments by an order of magnitude. It opens up the opportunity to observe transient phenomena like gamma-ray bursts (GRBs) and flaring active galactic nuclei (AGN). In order to successfully trigger multi-wavelength observations of transients, CTA has to be able to alert other observatories as quickly as possible. Multi-wavelength observations are essential for gaining insights into the processes occurring within these sources of such high energy radiation. CTA will consist of approximately 100 telescopes of different sizes and designs. Images are streamed from all the telescopes into a central computing facility on site. During observation CTA will produce a stream of up to 20 000…
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems
