Mind the Gap: Navigating Inference with Optimal Transport Maps
Malte Algren, Tobias Golling, Francesco Armando Di Bello, Christopher Pollard

TL;DR
This paper introduces a novel calibration method using optimal transport to address simulation-data discrepancies in particle physics, enabling more accurate high-dimensional modeling for downstream analysis.
Contribution
It presents the first application of optimal transport-based calibration to high-dimensional simulations in particle physics, improving the reliability of ML models.
Findings
Calibrated jet representations improve downstream task accuracy.
The method effectively corrects high-dimensional simulation discrepancies.
The calibration enables unbiased use of foundation models in particle physics.
Abstract
Machine learning (ML) techniques have recently enabled enormous gains in sensitivity to new phenomena across the sciences. In particle physics, much of this progress has relied on excellent simulations of a wide range of physical processes. However, due to the sophistication of modern machine learning algorithms and their reliance on high-quality training samples, discrepancies between simulation and experimental data can significantly limit their effectiveness. In this work, we present a solution to this "misspecification" problem: a model calibration approach based on optimal transport, which we apply to high-dimensional simulations for the first time. We demonstrate the performance of our approach through jet tagging, using a dataset inspired by the CMS experiment at the Large Hadron Collider. A 128-dimensional internal jet representation from a powerful general-purpose classifier…
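To make the idea of calibrating simulation toward data with a transport map concrete, here is a minimal one-dimensional sketch. It is not the authors' method: the paper works with a high-dimensional (128-dimensional) representation, whereas this example uses the fact that in one dimension the optimal transport map for a convex cost is the monotone rearrangement, i.e. quantile matching. The function name `fit_ot_map_1d`, the Gaussian toy samples, and the number of quantiles are all assumptions chosen for illustration.

```python
import numpy as np


def fit_ot_map_1d(sim, data, n_quantiles=101):
    """Fit a 1D transport map from `sim` to `data` by quantile matching.

    Hypothetical helper for illustration only. In 1D the optimal transport
    map for a convex cost is F_data^{-1}(F_sim(x)); here it is approximated
    by linear interpolation between matched empirical quantiles.
    """
    probs = np.linspace(0.0, 1.0, n_quantiles)
    sim_q = np.quantile(sim, probs)    # empirical quantiles of simulation
    data_q = np.quantile(data, probs)  # empirical quantiles of data

    def transport(x):
        # Piecewise-linear monotone map sending sim quantiles to data quantiles.
        # Inputs outside the fitted range are clipped to the end quantiles.
        return np.interp(x, sim_q, data_q)

    return transport


# Toy example (assumed): "simulation" and "data" differ in mean and width.
rng = np.random.default_rng(0)
sim = rng.normal(loc=0.0, scale=1.0, size=20_000)
data = rng.normal(loc=0.5, scale=1.2, size=20_000)

ot_map = fit_ot_map_1d(sim, data)
calibrated = ot_map(sim)  # simulated values moved onto the data distribution
```

After calibration, the mean and width of `calibrated` should sit close to those of `data`, while the ordering of the simulated values is preserved because the map is monotone. In higher dimensions no such closed form exists, and the map has to be learned by other means.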
