FM4NPP: A Scaling Foundation Model for Nuclear and Particle Physics
David Park, Shuhang Li, Yi Huang, Xihaier Luo, Haiwang Yu, Yeonju Go, Christopher Pinkenburg, Yuewei Lin, Shinjae Yoo, Joseph Osborn, Jin Huang, Yihui Ren

TL;DR
This paper introduces FM4NPP, a large-scale foundation model for particle physics that leverages a new dataset and self-supervised training to outperform baselines and adapt efficiently across diverse tasks.
Contribution
The work presents a novel self-supervised training method and a large dataset, and demonstrates the scalability and task generalization of a foundation model in experimental particle physics.
Findings
Models scaled up to 188 million parameters outperform baselines
Task-agnostic representations can be specialized with a linear mapping
Model exhibits robust data-efficient adaptation
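The second finding, that task-agnostic representations can be specialized with a linear mapping, can be illustrated with a linear probe on frozen features. The sketch below is not the authors' implementation: `frozen_encoder` is a hypothetical stand-in for a pretrained backbone, the data are synthetic, and the ridge-regression probe is an assumed choice of linear mapping.

```python
import numpy as np

rng = np.random.default_rng(0)


def frozen_encoder(x, w_frozen):
    """Hypothetical stand-in for a pretrained backbone whose weights stay fixed."""
    return np.tanh(x @ w_frozen)


def add_bias(features):
    """Append a constant column so the linear map can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_linear_probe(features, targets, ridge=1e-3):
    """Closed-form ridge regression from frozen features to task targets."""
    f = add_bias(features)
    gram = f.T @ f + ridge * np.eye(f.shape[1])
    return np.linalg.solve(gram, f.T @ targets)


def apply_linear_probe(features, probe):
    """Apply the learned linear map to frozen features."""
    return add_bias(features) @ probe


# Synthetic data (an assumption for illustration, not the paper's dataset).
n, d_in, d_feat, n_classes = 600, 8, 32, 3
w_frozen = rng.normal(size=(d_in, d_feat)) / np.sqrt(d_in)
x = rng.normal(size=(n, d_in))
labels = np.argmax(x[:, :n_classes], axis=1)
one_hot = np.eye(n_classes)[labels]

w_before = w_frozen.copy()          # kept only to check the backbone is untouched
feats = frozen_encoder(x, w_frozen)

# Only the probe is fitted; the encoder weights are never updated.
probe = fit_linear_probe(feats[:400], one_hot[:400])
pred = np.argmax(apply_linear_probe(feats[400:], probe), axis=1)
accuracy = float(np.mean(pred == labels[400:]))
print(f"held-out accuracy of the linear probe: {accuracy:.2f}")
```

The design point is that all task-specific capacity lives in `probe`, so the quality of the downstream result reflects what the frozen representation already encodes.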
Abstract
Large language models have revolutionized artificial intelligence by enabling large, generalizable models trained through self-supervision. This paradigm has inspired the development of scientific foundation models (FMs). However, applying this capability to experimental particle physics is challenging due to the sparse, spatially distributed nature of detector data, which differs dramatically from natural language. This work asks whether an FM for particle physics can scale and generalize across diverse tasks. We introduce a new dataset with more than 11 million particle collision events, together with a suite of downstream tasks and labeled data for evaluation. We propose a novel self-supervised training method for detector data and demonstrate its neural scalability with models that feature up to 188 million parameters. With frozen weights and task-specific adapters, this FM consistently…
Peer Reviews
Decision: ICLR 2026 Poster
The authors present detailed technical considerations for building the foundation model, though there are shortcomings in specific design and experimental choices.
- The manuscript does not adequately situate its contribution within the emerging landscape of foundation models for high-energy physics. It fails to discuss or compare against other recent efforts (e.g., models based on transformers, graph networks, or other architectures applied to similar data).
- The datasets used appear limited in scope, focusing on a specific collision system and energy. How does this restricted training data affect the model's validity and utility as a general-purpose fou

- The paper is well-organized, includes a comprehensive literature review, and provides high-quality figures and diagrams.
- The paper's stated contributions are clear and address a difficult, high-impact problem in the NPP domain. The approach potentially lays the groundwork for tackling broader challenges relevant to scientific foundation models beyond particle physics settings.
- The reported evaluation is extensive, covering a variety of important dimensions that help position the model's utilit

- I find the core positioning of the paper to be somewhat difficult to pin down. There is a clear focus on particle physics as the primary domain of interest, but the proposed training scheme for scaling and adapting FMs to downstream tasks is very general. To fully demonstrate the generality of the scheme, more (scientific) domains would need to be evaluated to verify the approach transfers well to other settings (helping substantiate the stated hypothesis that FMs encode task-agnostic represen

* The authors present a well-motivated problem that they address with a comprehensive multi-step framework
* The self-supervised learning objective is physics-motivated and fits the detector setup intuitively
* Very thorough benchmarking and ablation studies show improved performance for all pieces of the foundation model
* Results on downstream tasks show large gains over baselines
* Scaling curves show increased performance for increased FLOPs/parameter count, indi

* Does not justify the choice of Mamba over linear transformer models or other state space model alternatives
* Unclear how the hierarchical raster scan would impact efficiency
* No benchmarks against efficient transformer baselines such as HEP-T (locality-sensitive hashing transformer for high energy physics applications, https://arxiv.org/abs/2402.12535), which performs very well on TrackML datasets
* For PID, also missing comparisons to dedicated physics ML algorithms, like MLPF (https://arx
Taxonomy
Topics: Particle physics theoretical and experimental studies
