Machine Learning as a Service for HEP
Valentin Kuznetsov

TL;DR
This paper proposes a Machine Learning as a Service (MLaaS) framework tailored for High-Energy Physics (HEP). It enables scalable training and deployment of ML models directly on data stored in the ROOT format used at CERN, accessed through distributed infrastructure.
Contribution
It introduces a modular MLaaS architecture that reads ROOT data, leverages WLCG for remote access, and serves models via HTTP, facilitating large-scale training and easy deployment in HEP.
Findings
Supports native ROOT data format processing
Enables remote data access via WLCG infrastructure
Provides HTTP-based model serving for integration
Abstract
Machine Learning (ML) will play a significant role in the success of the upcoming High-Luminosity LHC (HL-LHC) program at CERN. The unprecedented amount of data, at the exabyte scale, to be collected by the CERN experiments in the next decade will require novel approaches to training and using ML models. In this paper we discuss a Machine Learning as a Service (MLaaS) model that can read HEP data in their native ROOT data format, rely on the Worldwide LHC Computing Grid (WLCG) infrastructure for remote data access, and serve a pre-trained model via the HTTP protocol. Such a modular design makes it possible to train ML models at large scale by reading ROOT files from remote storage, avoiding transformation of the data into the flat formats currently used by ML frameworks, and to easily access pre-trained ML models from existing infrastructure and applications.
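The HTTP serving step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `/predict` path, the JSON payload shape (`{"features": [...]}`), and the `predict` stand-in (which only averages its inputs in place of a real pre-trained model) are all assumptions made for illustration. It uses only the Python standard library.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


def predict(features):
    """Hypothetical stand-in for a pre-trained model: mean of the inputs."""
    return sum(features) / len(features)


class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST /predict (an assumed endpoint) with a JSON prediction."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = {"prediction": predict(payload["features"])}
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def query(url, features):
    """Client side: send features as JSON and return the prediction."""
    request = Request(
        url,
        data=json.dumps({"features": features}).encode(),
        headers={"Content-Type": "application/json"},
    )
    opener = build_opener(ProxyHandler({}))  # ignore any proxy settings
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())["prediction"]


def demo():
    """Start the server on a free local port, query it once, shut it down."""
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return query(f"http://127.0.0.1:{port}/predict", [1.0, 2.0, 3.0])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` runs one request and response round trip locally. Because the client only needs to speak HTTP and JSON, an existing application could request predictions without linking against an ML framework, which is the integration benefit the abstract points to.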
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management · Advanced Data Storage Technologies
