Africanus I. Scalable, distributed and efficient radio data processing with Dask-MS and Codex Africanus
Simon J. Perkins, Jonathan S. Kenyon, Lexy A.L. Andati, Hertzog L. Bester, Oleg M. Smirnov, Benjamin V. Hugo

TL;DR
This paper introduces Dask-MS and Codex Africanus, Python libraries that enable scalable, distributed, and efficient radio data processing for large interferometer datasets, balancing performance, flexibility, and ease of development.
Contribution
The paper presents a framework built on Dask for distributed radio astronomy data processing, using open-source tools to handle growing data volumes and complexity.
Findings
Enables high-performance distributed processing with Dask.
Balances flexibility, performance, and ease-of-use in radio data pipelines.
Facilitates scalable processing on HPC and cloud infrastructures.
Abstract
New radio interferometers such as MeerKAT, SKA, ngVLA, and DSA-2000 drive advancements in software for two key reasons. First, handling the vast data from these instruments requires subdivision and multi-node processing. Second, their improved sensitivity, achieved through better engineering and larger data volumes, demands new techniques to fully exploit it. This creates a critical challenge in radio astronomy software: pipelines must be optimized to process data efficiently, but unforeseen artefacts from increased sensitivity require ongoing development of new techniques. This leads to a trade-off among (1) performance, (2) flexibility, and (3) ease-of-development. Rigid designs often miss the full scope of the problem, while temporary research code is unsuitable for production. This work introduces a framework for developing radio astronomy techniques while balancing the above…
Taxonomy
Topics: Computational Physics and Python Applications · Advanced Computational Techniques and Applications · Speech Recognition and Synthesis
