Building a Distributed Computing System for LDMX: Challenges of creating and operating a lightweight e-infrastructure for small-to-medium size accelerator experiments
Lene Kristian Bryngemark, David Cameron, Valentina Dutta, Thomas Eichlersmith, Balazs Konya, Omar Moreno, Geoffrey Mullier, Florido Paganelli, Ruth Pöttgen, Fuzzy Rogers, Andrii Salnikov, Paul Weakliem

TL;DR
This paper discusses the development and testing of a lightweight distributed computing system for the LDMX experiment, demonstrating that leveraging existing technologies can simplify infrastructure setup for small-to-medium scale scientific collaborations.
Contribution
It introduces a scalable, low-overhead distributed computing solution tailored for small-scale accelerator experiments, emphasizing integration of existing tools to reduce development effort.
Findings
Successful deployment of a pilot system for large-scale simulations
Significant reduction in setup and operational effort
Enhanced scalability and reproducibility of computing resources
Abstract
Particle physics experiments rely extensively on computing and data services, making e-infrastructure an integral part of the research collaboration. Constructing and operating distributed computing can however be challenging for a smaller-scale collaboration. The Light Dark Matter eXperiment (LDMX) is a planned small-scale accelerator-based experiment to search for dark matter in the sub-GeV mass region. Finalizing the design of the detector relies on Monte-Carlo simulation of expected physics processes. A distributed computing pilot project was proposed to better utilize available resources at the collaborating institutes, and to improve scalability and reproducibility. This paper outlines the chosen lightweight distributed solution, presenting requirements, the component integration steps, and the experiences using a pilot system for tests with large-scale simulations. The system…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management · Advanced Data Storage Technologies
