Les Houches guide to reusable ML models in LHC analyses
Jack Y. Araz, Andy Buckley, Gregor Kasieczka, Jan Kieseler, Sabine Kraml, Anders Kvellestad, Andre Lessa, Tomasz Procter, Are Raklev, Humberto Reyes-Gonzalez, Krzysztof Rolbiecki, Sezen Sekmen, Gokhan Unel

TL;DR
This paper discusses the challenges and strategies for making machine-learning models in high-energy physics analyses reusable and trustworthy, emphasizing technical and strategic solutions for analysis preservation.
Contribution
It provides a comprehensive overview of practical issues and promising approaches for preserving and reusing ML models in LHC analyses.
Findings
Highlights practical issues in ML model reporting and stability.
Identifies promising technical solutions for model preservation.
Emphasizes strategic approaches for trustworthy analysis reuse.
Abstract
With the increasing usage of machine learning in high-energy physics analyses, the publication of the trained models in a reusable form has become a crucial question for analysis preservation and reuse. The complexity of these models creates practical issues both for reporting them accurately and for ensuring the stability of their behaviours in different environments and over extended timescales. In this note we discuss the current state of affairs, highlighting specific practical issues and focusing on the most promising technical and strategic approaches to ensure trustworthy analysis preservation. This material originated from discussions in the LHC Reinterpretation Forum and the 2023 PhysTeV workshop at Les Houches.
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Particle physics theoretical and experimental studies
