Container-Based Pre-Pipeline Data Processing on HPC for XRISM
Satoshi Eguchi, Makoto Tashiro, Yukikatsu Terada, Hiromitsu Takahashi, Masayoshi Nobukawa, Ken Ebisawa, Katsuhiro Hayashi, Tessei Yoshida, Yoshiaki Kanemaru, Shoji Ogawa, Matthew P. Holland, Michael Loewenstein, Eric D. Miller, Tahir Yaqoob, Robert S. Hill, Morgan D. Waddy

TL;DR
This paper describes how the pre-pipeline (PPL) data processing software for the XRISM satellite was ported to JAXA's "TOKI-RURI" HPC system using Singularity containers, allowing approximately 160 PPL processes to complete within 24 hours.
Contribution
The paper presents a novel approach for porting XRISM's pre-pipeline (PPL) software to an HPC system using the Singularity container platform and its "--bind" option.
Findings
Ported the PPL to the JAXA "TOKI-RURI" HPC system using Singularity containers.
Completed approximately 160 PPL processes within 24 hours.
Describes the porting strategy for the HPC environment in detail.
Abstract
The X-Ray Imaging and Spectroscopy Mission (XRISM) is the 7th Japanese X-ray observatory, developed and operated in collaboration with universities and research institutes in Japan, the U.S., and Europe, including JAXA, NASA, and ESA. The telemetry data downlinked from the satellite are reduced to scientific products by the pre-pipeline (PPL) and pipeline (PL) software, which run on standard Linux virtual machines on the JAXA and NASA sides, respectively. We ported the PPL to the JAXA "TOKI-RURI" high-performance computing (HPC) system, which is capable of completing approximately 160 PPL processes within 24 hours, by utilizing the Singularity container platform and its "--bind" option. In this paper, we briefly describe the data processing in XRISM and present our strategy for porting the PPL to the HPC environment in detail.
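To make the role of the "--bind" option concrete, the following is a minimal sketch of how a containerized job might be launched with host directories mounted into the container. It is not the authors' implementation: the image name `ppl.sif`, the script `run_ppl.sh`, and every path are hypothetical placeholders. Only the `singularity exec --bind host_path:container_path image command` form is taken from the Singularity command line.

```python
import shlex
from typing import Dict, List


def build_singularity_command(
    image: str, binds: Dict[str, str], command: List[str]
) -> List[str]:
    """Build the argument list for `singularity exec` with --bind mounts.

    `binds` maps a host path to the path where it should appear
    inside the container.
    """
    argv = ["singularity", "exec"]
    for host_path, container_path in binds.items():
        # Each --bind makes a host directory visible inside the container.
        argv += ["--bind", f"{host_path}:{container_path}"]
    argv.append(image)
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # All names below are hypothetical and serve only as an illustration.
    argv = build_singularity_command(
        image="ppl.sif",
        binds={
            "/hpc/scratch/telemetry": "/data/in",
            "/hpc/scratch/products": "/data/out",
        },
        command=["run_ppl.sh", "/data/in", "/data/out"],
    )
    # Print the command instead of running it, so the sketch works
    # on machines without Singularity installed.
    print(shlex.join(argv))
```

Binding host directories in this way lets software inside the container keep using fixed internal paths while the actual data reside on the HPC file system, which is presumably why the option matters for a port of this kind.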
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
