Surrogate Modeling for Scalable Evaluation of Distributed Computing Systems for HEP Applications
Larissa Schmid, Maximilian Horzela, Valerii Zhyla, Manuel Giffels, Günter Quast, Anne Koziolek

TL;DR
This paper explores using machine learning surrogate models to efficiently simulate distributed computing systems for high-energy physics, achieving faster predictions with acceptable accuracy compared to traditional simulators.
Contribution
It evaluates three ML models for simulating distributed systems, demonstrating their potential to generalize and significantly speed up evaluations for HEP applications.
Findings
ML models predict key observables with approximate accuracy
Surrogate models run orders of magnitude faster than traditional simulators
Potential for improving prediction accuracy and generalizability
Abstract
The Worldwide LHC Computing Grid (WLCG) provides the robust computing infrastructure essential for the LHC experiments by integrating global computing resources into a cohesive entity. Simulations of different compute models present a feasible approach for evaluating future adaptations that are able to cope with future increased demands. However, running these simulations incurs a trade-off between accuracy and scalability. For example, while the simulator DCSim can provide accurate results, it falls short on scaling with the size of the simulated platform. Using Generative Machine Learning as a surrogate presents a candidate for overcoming this challenge. In this work, we evaluate the usage of three different Machine Learning models for the simulation of distributed computing systems and assess their ability to generalize to unseen situations. We show that those models can predict…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Simulation Techniques and Applications
