CaloArt: Large-Patch x-Prediction Diffusion Transformers for High-Granularity Calorimeter Shower Generation
Zhengkun Huang, Gongxing Sun

TL;DR
CaloArt is a diffusion transformer that combines large-patch tokenization with x-prediction for high-granularity calorimeter shower simulation. It reports the best FPD on CaloChallenge Dataset 2 and generates showers in roughly 10 ms each on a single GPU.
Contribution
The paper presents CaloArt, a DiT-style diffusion transformer with 3D positional encoding, trained via conditional flow matching. Its large-patch tokenization and x-prediction make raw voxel generation efficient on high-granularity grids while preserving shower quality.
Findings
On CaloChallenge Dataset 2, CaloArt achieves the best FPD and the strongest high-level and ResNet classifier metrics.
On CaloChallenge Dataset 3, x-prediction improves all reported metrics over v-prediction.
Models generate showers in approximately 10 ms per shower on a single GPU.
Abstract
High-granularity calorimeters make ML-based fast shower simulation a high-dimensional generative modeling problem, where voxel-space generators must balance physics fidelity with training and inference cost. This work studies large-patch tokenization with x-prediction, enabling efficient raw voxel generation. We propose CaloArt, a modernized DiT-style backbone with 3D positional encoding and architectural refinements, trained via conditional flow matching with decoupled prediction and loss spaces. On CaloChallenge Dataset 2, where small patch size remains affordable, v-prediction performs well, and CaloArt achieves the best FPD, strongest high-level metrics, and strongest ResNet classifier metrics. On CaloChallenge Dataset 3, the 40500-voxel grid makes large patches necessary; x-prediction improves all reported metrics over v-prediction and places CaloArt on the quality-generation-time…
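To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows (a) how a larger 3D patch shrinks the token sequence for a 40500-voxel grid and (b) how an x-prediction (a predicted clean shower) can be converted into a velocity, so that the prediction space and the loss space can differ. Several things here are assumptions: the grid layout of 45 layers x 50 angular bins x 18 radial bins, the patch size (3, 5, 3), the interpolation convention x_t = t*x + (1 - t)*noise, and the helper names `patchify` and `velocity_from_x_pred`.

```python
import numpy as np

# Assumed layout for the 40500-voxel grid: 45 layers x 50 angular x 18 radial bins.
GRID = (45, 50, 18)

def patchify(x, patch):
    """Split a (Z, A, R) voxel grid into non-overlapping 3D patches, one token per patch."""
    (Z, A, R), (pz, pa, pr) = x.shape, patch
    assert Z % pz == 0 and A % pa == 0 and R % pr == 0, "patch must divide the grid"
    x = x.reshape(Z // pz, pz, A // pa, pa, R // pr, pr)
    x = x.transpose(0, 2, 4, 1, 3, 5)      # patch indices first, intra-patch axes last
    return x.reshape(-1, pz * pa * pr)     # (num_tokens, voxels_per_token)

def velocity_from_x_pred(x_pred, x_t, t, eps=1e-3):
    """Turn a clean-sample prediction into a velocity, assuming x_t = t*x + (1 - t)*noise.

    Under that convention the target velocity is x - noise = (x - x_t) / (1 - t).
    The clamp on (1 - t) is a hypothetical safeguard near t = 1.
    """
    return (x_pred - x_t) / np.maximum(1.0 - t, eps)

rng = np.random.default_rng(0)
shower = rng.random(GRID)                  # stand-in for a real shower
noise = rng.standard_normal(GRID)
t = 0.3
x_t = t * shower + (1.0 - t) * noise

tokens_small = patchify(shower, (1, 1, 1))  # one token per voxel
tokens_large = patchify(shower, (3, 5, 3))  # hypothetical large patch
v_hat = velocity_from_x_pred(shower, x_t, t)  # a perfect x-prediction recovers x - noise

print(tokens_small.shape, tokens_large.shape)
```

With these assumed sizes the sequence length drops from 40500 tokens to 900 tokens of 45 voxels each, which is the kind of reduction that makes attention over the full grid affordable. The conversion function illustrates how a network can output a clean shower while the loss is still computed on velocities.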
