Generalizable Foundation Models for Calorimetry via Mixtures-of-Experts and Parameter Efficient Fine Tuning

Carlos Cardona-Giraldo; Cristiano Fanelli; James Giroux; Cole Granger; Benjamin Nachman; Gerald Sabin

arXiv:2603.28804 · physics.ins-det · April 1, 2026

TL;DR

This paper presents a generalizable, efficient foundation model for calorimetry in particle physics, leveraging transformer architectures, Mixture-of-Experts, and modular fine-tuning to adapt across materials, particles, and configurations.

Contribution

It introduces a novel next-token transformer-based model with Mixture-of-Experts and parameter-efficient fine-tuning for scalable, incremental learning in calorimeter simulations.

Findings

01 Model supports modular adaptation to new materials and particles.

02 Achieves performance competitive with standard generative models.

03 Enables incremental knowledge integration without catastrophic forgetting.

Abstract

Modern particle physics experiments face an increasing demand for high-fidelity detector simulation as luminosities rise and computational requirements approach the limits of available resources. Deep generative models have emerged as promising surrogates for traditional Monte Carlo simulation, with recent advances drawing inspiration from large language models (LLMs) and next-token prediction paradigms. In this work, we introduce a generalizable foundation model for calorimetry built on next-token transformer backbones, designed to support modular adaptation across materials, particle species, and detector configurations. Our approach combines Mixture-of-Experts pre-training with parameter-efficient fine-tuning strategies to enable controlled, additive model expansion without catastrophic forgetting. A pre-trained backbone is trained to generate electromagnetic showers across multiple…
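The idea of additive expansion without catastrophic forgetting can be illustrated with a toy example. The sketch below is not the paper's architecture: it assumes a hypothetical `ToyExpertBank` of small linear experts with hard routing by a condition index (standing in for a learned gate), and plain SGD on a squared-error loss. It only shows the mechanism: existing experts are frozen, a new expert is appended, and training touches the new parameters alone, so outputs for previously learned conditions stay exactly the same.

```python
import random


class ToyExpertBank:
    """Hypothetical toy model: one linear expert per condition (e.g. a material index)."""

    def __init__(self, dim, num_experts, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.experts = [self._new_matrix() for _ in range(num_experts)]
        # frozen[i] is True when expert i must not receive updates.
        self.frozen = [False] * num_experts

    def _new_matrix(self):
        # Small random dim x dim weight matrix stored as nested lists.
        return [[self.rng.gauss(0.0, 0.1) for _ in range(self.dim)]
                for _ in range(self.dim)]

    def forward(self, x, expert_id):
        # Hard routing (an assumption): the caller picks the expert directly.
        w = self.experts[expert_id]
        return [sum(w[i][j] * x[j] for j in range(self.dim))
                for i in range(self.dim)]

    def freeze_all(self):
        self.frozen = [True] * len(self.experts)

    def add_expert(self):
        # Additive expansion: append a trainable expert, leave the others untouched.
        self.experts.append(self._new_matrix())
        self.frozen.append(False)
        return len(self.experts) - 1

    def sgd_step(self, x, target, expert_id, lr=0.1):
        # One gradient step on 0.5 * ||W x - target||^2; no-op for frozen experts.
        if self.frozen[expert_id]:
            return False
        pred = self.forward(x, expert_id)
        w = self.experts[expert_id]
        for i in range(self.dim):
            err = pred[i] - target[i]
            for j in range(self.dim):
                w[i][j] -= lr * err * x[j]
        return True


bank = ToyExpertBank(dim=4, num_experts=2)
x = [1.0, 0.5, -0.5, 2.0]
old_output = bank.forward(x, 0)      # behaviour of an already "pre-trained" expert

bank.freeze_all()                    # protect everything learned so far
new_id = bank.add_expert()           # new capacity for a new condition
for _ in range(20):
    bank.sgd_step(x, [0.0] * 4, new_id)
```

Because updates are confined to the appended expert, `bank.forward(x, 0)` returns the same values before and after training. Practical parameter-efficient fine-tuning schemes such as low-rank adapters follow the same principle of training a small set of new parameters beside frozen ones, though whether and how the paper uses them is not stated in the excerpt above.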
