Scaling MadMiner with a deployment on REANA
Irina Espejo, Sinclert Pérez, Kenyi Hurtado, Lukas Heinrich, Kyle Cranmer

TL;DR
This paper demonstrates how to deploy MadMiner, a multivariate inference tool for high-energy physics, on the REANA platform, achieving scalable performance and significantly reducing analysis time.
Contribution
It introduces a reproducible, scalable workflow for MadMiner using YAML and REANA, enabling efficient large-scale physics analyses.
Findings
Linear scaling of physics sub-workflow with resources
Workflow processes 11 million events in 5 hours
Deployment enhances accessibility and reproducibility
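As a quick sanity check on these figures, the sketch below converts the quoted 11 million events in 5 hours into an average throughput. The `projected_hours` helper is hypothetical: it assumes ideal linear scaling for the whole run, whereas the findings report linear scaling only for the physics sub-workflow, so treat its output as a rough estimate.

```python
# Figures quoted in the Findings above.
TOTAL_EVENTS = 11_000_000
WALL_HOURS = 5

# Average throughput implied by those two numbers.
events_per_hour = TOTAL_EVENTS / WALL_HOURS
events_per_second = events_per_hour / 3600


def projected_hours(n_events, resource_factor=1.0):
    """Estimate wall time for n_events.

    Assumption (not from the paper): k times the resources gives
    k times the throughput for the entire workflow.
    """
    return n_events / (events_per_hour * resource_factor)


print(f"{events_per_hour:,.0f} events/hour")      # 2,200,000 events/hour
print(f"{events_per_second:,.0f} events/second")  # 611 events/second
print(f"{projected_hours(TOTAL_EVENTS, 2.0):.1f} hours with doubled resources (assumed)")
```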
Abstract
MadMiner is a Python package that implements a powerful family of multivariate inference techniques that leverage matrix element information and machine learning. This multivariate approach neither requires the reduction of high-dimensional data to summary statistics nor any simplifications to the underlying physics or detector response. In this paper, we address some of the challenges arising from deploying MadMiner in a real-scale HEP analysis with the goal of offering a new tool in HEP that is easily accessible. The proposed approach encapsulates a typical MadMiner pipeline into a parametrized yadage workflow described in YAML files. The general workflow is split into two yadage sub-workflows, one dealing with the physics simulations and the other with the ML inference. After that, the workflow is deployed using REANA, a reproducible research data analysis platform that takes care of…
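The two-stage structure described in the abstract can be pictured with a small plain-Python sketch. This is not the yadage or REANA API and not the authors' workflow definition: `Stage`, `Workflow`, `build_workflow`, and the parameters `n_events` and `n_jobs` are hypothetical names chosen only to show how a parametrized pipeline might split into a physics stage feeding an ML stage.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """One sub-workflow with its own parameters (illustrative only)."""
    name: str
    parameters: dict = field(default_factory=dict)


@dataclass
class Workflow:
    """An ordered list of stages; later stages consume earlier outputs."""
    stages: list

    def run_order(self):
        return [stage.name for stage in self.stages]


def build_workflow(n_events, n_jobs):
    # Hypothetical parametrization: split the simulation evenly across jobs,
    # rounding up so that every event is assigned to some job.
    events_per_job = -(-n_events // n_jobs)
    physics = Stage(
        "physics",
        {"n_events": n_events, "n_jobs": n_jobs, "events_per_job": events_per_job},
    )
    # The ML stage depends on the physics stage's output.
    ml = Stage("ml", {"input_from": physics.name})
    return Workflow([physics, ml])


# n_jobs=100 is an arbitrary illustrative value, not a figure from the paper.
wf = build_workflow(n_events=11_000_000, n_jobs=100)
print(wf.run_order())                             # ['physics', 'ml']
print(wf.stages[0].parameters["events_per_job"])  # 110000
```

Keeping the parameters in one place is what makes such a pipeline easy to rerun at a different scale: only the arguments to `build_workflow` change, not the stage definitions.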
Taxonomy
Topics: Computational Physics and Python Applications · Machine Learning in Materials Science
