TL;DR
This paper reviews evaluation metrics for generative models in high energy physics (HEP), proposes two new ones, the Fréchet and kernel physics distances (FPD and KPD), and demonstrates their effectiveness in distinguishing models and datasets.
Contribution
It systematically reviews evaluation metrics for generative models in HEP, introduces two new physics distances, and validates their effectiveness through experiments and model comparisons.
Findings
FPD is highly sensitive to various jet distribution failures.
The proposed metrics outperform traditional ones in detecting discrepancies.
The JetNet Python library implements these new evaluation metrics.
Abstract
There has been a recent explosion in research into machine-learning-based generative modeling to tackle computational challenges for simulations in high energy physics (HEP). In order to use such alternative simulators in practice, we need well-defined metrics to compare different generative models and evaluate their discrepancy from the true distributions. We present the first systematic review and investigation into evaluation metrics and their sensitivity to failure modes of generative models, using the framework of two-sample goodness-of-fit testing, and their relevance and viability for HEP. Inspired by previous work in both physics and computer vision, we propose two new metrics, the Fréchet and kernel physics distances (FPD and KPD, respectively), and perform a variety of experiments measuring their performance on simple Gaussian-distributed and simulated high energy jet…
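To make the idea behind a Fréchet-style distance concrete, here is a minimal sketch of the standard Fréchet distance between Gaussians fitted to two feature samples, d² = |μ₁ − μ₂|² + Tr(Σ₁ + Σ₂ − 2(Σ₁Σ₂)^{1/2}). It is an illustration under stated assumptions, not the paper's implementation or the JetNet API: the function name `frechet_gaussian_distance` is hypothetical, the inputs are assumed to be arrays of precomputed per-jet features, and the choice of physics features and any finite-sample bias handling are left out.

```python
import numpy as np


def frechet_gaussian_distance(real_feats, gen_feats):
    """Squared Frechet distance between Gaussians fitted to two samples.

    Hypothetical helper for illustration only. Both inputs are assumed to be
    arrays of shape (n_samples, n_features) holding precomputed features.
    """
    real_feats = np.asarray(real_feats, dtype=float)
    gen_feats = np.asarray(gen_feats, dtype=float)

    # Fit a Gaussian to each sample: mean vector and covariance matrix.
    mu_r = real_feats.mean(axis=0)
    mu_g = gen_feats.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(real_feats, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen_feats, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # positive semi-definite covariances. Clip small negative round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_g).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)


# Usage on toy Gaussian data (an assumed stand-in for real jet features).
rng = np.random.default_rng(0)
real = rng.normal(size=(5000, 3))
shifted = real + np.array([1.0, 2.0, 0.0])  # same covariance, shifted mean

print(frechet_gaussian_distance(real, real))
print(frechet_gaussian_distance(real, shifted))
```

A sample compared with itself gives a distance of zero up to round-off, while a pure shift of the mean contributes the squared length of the shift, since the covariances are unchanged.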
