A Gateway to Astronomical Image Processing: Vera C. Rubin Observatory LSST Science Pipelines on AWS
Dino Bektesevic, Hsin-Fang Chiang, Kian-Tat Lim, Todd L. Miller, Greg Thain, Tim Jenness, James Bosch, Andrei Salnikov, and Andrew Connolly

TL;DR
This paper explores deploying the Rubin LSST Science Pipelines on AWS to handle the massive data volume from the Vera C. Rubin Observatory's decade-long sky survey, enabling scalable analysis of astronomical images and catalogs.
Contribution
It presents the initial implementation and evaluation of executing LSST Science Pipelines on AWS, addressing scalability and performance challenges for large-scale astronomical data processing.
Findings
AWS deployment shows promising scalability and performance.
Cost analysis indicates feasible cloud-based processing.
Initial results support cloud infrastructure for future astronomical data analysis.
Abstract
The Legacy Survey of Space and Time (LSST), operated by the Vera C. Rubin Observatory, is a 10-year astronomical survey due to start operations in 2022 that will image half the sky every three nights. LSST will produce ~20 TB of raw data per night, which will be calibrated and analyzed in almost real time. Given the volume of LSST data, the traditional subset-download-process paradigm of data reprocessing faces significant challenges. We describe here the first steps towards a gateway for astronomical science that would enable astronomers to analyze images and catalogs at scale. In this first step we focus on executing the Rubin LSST Science Pipelines, a collection of image and catalog processing algorithms, on Amazon Web Services (AWS). We describe our initial impressions of the performance, scalability, and cost of deploying such a system in the cloud.
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
