Learning from the present for the future: the Juelich LOFAR Long-term Archive
C. Manzano, A. Miskolczi, H. Stiele, V. Vybornov, T. Fieseler, S. Pfalzner

TL;DR
This paper reviews the management of the Juelich LOFAR long-term radio astronomy data archive, analyzes its current state and bottlenecks, and proposes solutions for future large-scale data storage needs.
Contribution
It provides a comprehensive assessment of current data management practices and identifies key hardware and software bottlenecks for long-term radio astronomy data archives.
Findings
Data access patterns reveal bottlenecks in tape retrieval and staging.
Hardware limitations include network bandwidth and cache performance.
Software workflows require tuning to improve data retrieval efficiency.
Abstract
The Forschungszentrum Juelich has been hosting the German part of the LOFAR archive since 2013. It is Germany's most extensive radio astronomy archive, currently storing nearly 22 petabytes (PB) of data. Future radio telescopes are expected to require a dramatic increase in long-term data storage. Here, we take stock of the current data management of the Juelich LOFAR Data Archive, describe the ingestion, the storage system, the export to the long-term archive, and the request chain. We analysed the data availability over the last 10 years and searched for the underlying data access pattern and the energy consumption of the process. We determine hardware-related limiting factors, such as network bandwidth and cache pool availability and performance, and software aspects, e.g. workflow adjustment and parameter tuning, as the main data storage bottlenecks. By contrast, the challenge in…
