Large Data Acquisition and Analytics at Synchrotron Radiation Facilities
Aashish Panta, Giorgio Scorzelli, Amy A. Gooch, Werner Sun, Katherine S. Shanks, Suchismita Sarker, Devin Bougie, Keara Soloway, Rolf Verberg, Tracy Berman, Glenn Tarcea, John Allison, Michela Taufer, Valerio Pascucci

TL;DR
This paper presents a framework, deployed at CHESS, for real-time remote monitoring of beamline experiments and management of the terabytes of data they generate, with reported gains in operational efficiency and data accessibility.
Contribution
It introduces a novel, deployed system that manages terabytes of data and over 10 million files, enabling remote experiment monitoring and data quality assessment at CHESS.
Findings
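Managed 50-100 TB of data and over 10 million files across three beamlines (ID3A, ID3B, ID4B) in late 2024.
Testing with 43 research groups and 86 dashboards showed reduced operational overhead and improved data accessibility.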
Streamlined data workflows for research groups.
Abstract
Synchrotron facilities like the Cornell High Energy Synchrotron Source (CHESS) generate massive data volumes from complex beamline experiments, but face challenges such as limited access time, the need for on-site experiment monitoring, and managing terabytes of data per user group. We present the design, deployment, and evaluation of a framework that addresses CHESS's data acquisition and management issues. Deployed on a secure CHESS server, our system provides real-time, web-based tools for remote experiment monitoring and data quality assessment, improving operational efficiency. Implemented across three beamlines (ID3A, ID3B, ID4B), the framework managed 50-100 TB of data and over 10 million files in late 2024. Testing with 43 research groups and 86 dashboards showed reduced overhead, improved accessibility, and streamlined data workflows. Our paper highlights the development,…
Taxonomy
Topics: Particle Accelerators and Free-Electron Lasers · Distributed and Parallel Computing Systems · Scientific Computing and Data Management
