Speeding up Madgraph5 aMC@NLO through CPU vectorization and GPU offloading: towards a first alpha release
Andrea Valassi, Taylor Childers, Laurence Field, Stephan Hageböck, Walter Hopkins, Olivier Mattelaer, Nathan Nichols, Stefan Roiser, David Smith, Jorgen Teig, Carl Vuosalo, Zenny Wettersten

TL;DR
This paper discusses efforts to accelerate the Madgraph5 aMC@NLO event generator by implementing CPU vectorization and GPU offloading for matrix element calculations, aiming for a first alpha release to improve LHC physics simulations.
Contribution
The paper reports on the reengineering of Madgraph5_aMC@NLO so that matrix element calculations run as vectorized C++ on CPUs and as CUDA or SYCL code on GPUs, and on the integration of these implementations into the existing MadEvent framework.
Findings
Successful implementation of vectorized C++ code for ME calculations
Development of CUDA and SYCL versions for GPU acceleration
Progress towards an alpha release supporting QCD LO processes
Abstract
The matrix element (ME) calculation in any Monte Carlo physics event generator is an ideal fit for implementing data parallelism with lockstep processing on GPUs and vector CPUs. For complex physics processes where the ME calculation is the computational bottleneck of event generation workflows, this can lead to large overall speedups by efficiently exploiting these hardware architectures, which are now largely underutilized in HEP. In this paper, we present the status of our work on the reengineering of the Madgraph5_aMC@NLO event generator at the time of the ACAT2022 conference. The progress achieved since our previous publication in the ICHEP2022 proceedings is discussed, for our implementations of the ME calculations in vectorized C++, in CUDA and in the SYCL framework, as well as in their integration into the existing MadEvent framework. The outlook towards a first alpha release of…
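To illustrate what "data parallelism with lockstep processing" means in practice, the following is a minimal sketch in C++, not code from Madgraph5_aMC@NLO. It assumes a structure-of-arrays layout in which each momentum component is stored contiguously across events, and it uses a toy per-event quantity (the invariant mass squared) as a stand-in for a real matrix element. The names `MomentaSoA` and `invariantMassSquared` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays (SoA) layout: one array per momentum
// component, indexed by event, so that the same component of neighbouring
// events is contiguous in memory.
struct MomentaSoA {
    std::vector<double> E, px, py, pz;
};

// Toy per-event kernel standing in for a matrix element calculation:
// the invariant mass squared, m^2 = E^2 - px^2 - py^2 - pz^2.
// Every event executes the same branch-free arithmetic, so a compiler can
// map several loop iterations onto one SIMD instruction, and a GPU port
// could assign one event to each thread.
inline void invariantMassSquared(const MomentaSoA& p, std::vector<double>& out) {
    const std::size_t nEvents = p.E.size();
    out.resize(nEvents);
    for (std::size_t i = 0; i < nEvents; ++i) {
        out[i] = p.E[i] * p.E[i]
               - p.px[i] * p.px[i]
               - p.py[i] * p.py[i]
               - p.pz[i] * p.pz[i];
    }
}
```

Calling `invariantMassSquared(p, m2)` fills `m2` with one value per event. The design point is that the loop body has no event-dependent branching, which is the property that lets many events advance in lockstep on vector CPUs and GPUs.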
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
