Data Provenance and Management in Radio Astronomy: A Stream Computing Approach
Mahmoud S. Mahmoud, Andrew Ensor, Alain Biem, Bruce Elmegreen, and Sergei Gulyaev

TL;DR
This paper explores a stream-computing approach using IBM InfoSphere Streams and hardware accelerators to improve data provenance and management in large-scale radio astronomy projects like the Square Kilometer Array.
Contribution
It demonstrates the viability of stream computing and accelerators for managing large, real-time radio astronomy data pipelines, with a case study on autocorrelating spectrometers.
Findings
Effective real-time data management with InfoSphere Streams
Advantages of stream computing over traditional methods
Successful implementation of an autocorrelating spectrometer
Abstract
New approaches for data provenance and data management (DPDM) are required for mega science projects like the Square Kilometer Array, characterized by extremely large data volume and intense data rates, therefore demanding innovative and highly efficient computational paradigms. In this context, we explore a stream-computing approach with the emphasis on the use of accelerators. In particular, we make use of a new generation of high performance stream-based parallelization middleware known as InfoSphere Streams. Its viability for managing and ensuring interoperability and integrity of signal processing data pipelines is demonstrated in radio astronomy. IBM InfoSphere Streams embraces the stream-computing paradigm. It is a shift from conventional data mining techniques (involving analysis of existing data from databases) towards real-time analytic processing. We discuss using InfoSphere…
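To make the stream-computing idea concrete, here is a minimal sketch of an autocorrelating spectrometer that consumes samples block by block instead of analysing a stored data set. It is an illustration only, not the authors' implementation: it is plain Python, uses neither InfoSphere Streams nor any accelerator, and the names `blocks`, `AutocorrelationSpectrometer`, `push`, and `power_spectrum` are hypothetical. It assumes real-valued samples and applies no window function; the spectrum is obtained from the accumulated autocorrelation through the Wiener-Khinchin relation.

```python
import math
from typing import Iterable, Iterator, List


def blocks(samples: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield fixed-size blocks, standing in for tuples arriving on a stream."""
    block: List[float] = []
    for s in samples:
        block.append(s)
        if len(block) == size:
            yield block
            block = []
    if block:  # flush a final partial block
        yield block


class AutocorrelationSpectrometer:
    """Hypothetical sketch: accumulates lag products without storing the signal."""

    def __init__(self, n_lags: int) -> None:
        self.n_lags = n_lags
        self.sums = [0.0] * n_lags
        self.counts = [0] * n_lags
        self.tail: List[float] = []  # last n_lags - 1 samples seen so far

    def push(self, block: List[float]) -> None:
        """Fold one block into the running lag sums."""
        data = self.tail + list(block)
        first_new = len(self.tail)
        for i in range(first_new, len(data)):
            for lag in range(min(self.n_lags, i + 1)):
                self.sums[lag] += data[i] * data[i - lag]
                self.counts[lag] += 1
        keep = self.n_lags - 1
        self.tail = data[-keep:] if keep > 0 else []

    def autocorrelation(self) -> List[float]:
        """Mean lag product for each lag accumulated so far."""
        return [s / c if c else 0.0 for s, c in zip(self.sums, self.counts)]

    def power_spectrum(self) -> List[float]:
        """Cosine transform of the autocorrelation (Wiener-Khinchin relation).

        Bin k corresponds to k / (2 * n_lags) cycles per sample.
        """
        r = self.autocorrelation()
        n = self.n_lags
        return [
            r[0] + 2.0 * sum(r[lag] * math.cos(math.pi * k * lag / n)
                             for lag in range(1, n))
            for k in range(n)
        ]


# Assumed test input: a pure tone at 0.125 cycles per sample.
signal = [math.cos(2.0 * math.pi * 0.125 * t) for t in range(1024)]
spectrometer = AutocorrelationSpectrometer(n_lags=16)
for block in blocks(signal, 64):
    spectrometer.push(block)  # state is updated as each block arrives

spectrum = spectrometer.power_spectrum()
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin, peak_bin / (2 * spectrometer.n_lags))
```

With 16 lags, the test tone falls in bin 4, which corresponds to 0.125 cycles per sample. Because `push` keeps only the running sums and the last `n_lags - 1` samples, memory use stays constant however long the stream runs, which is the property that distinguishes this style from batch analysis of archived data.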
Taxonomy
Topics: Advanced Database Systems and Queries · Scientific Computing and Data Management · Service-Oriented Architecture and Web Services
