SURFing to the Fundamental Limit of Jet Tagging
Ian Pang, Darius A. Faroughy, David Shih, Ranit Das, Gregor Kasieczka

TL;DR
This paper introduces the SUrrogate ReFerence (SURF) method for validating generative models used in jet tagging, shows that current jet taggers may already be near the fundamental performance limit, and cautions that autoregressive models can overstate that limit.
Contribution
The paper presents the SURF framework for validating generative models in jet tagging, establishing a way to assess how close jet taggers are to fundamental performance limits.
Findings
Modern jet taggers may already operate close to the true statistical limit.
Autoregressive GPT models overstate separation power, giving a misleading estimate of the fundamental limit.
SURF provides a reliable validation of generative models for jet classification.
Abstract
Beyond the practical goal of improving search and measurement sensitivity through better jet tagging algorithms, there is a deeper question: what are their upper performance limits? Generative surrogate models with learned likelihood functions offer a new approach to this problem, provided the surrogate correctly captures the underlying data distribution. In this work, we introduce the SUrrogate ReFerence (SURF) method, a new approach to validating generative models. This framework enables exact Neyman-Pearson tests by training the target model on samples from another tractable surrogate, which is itself trained on real data. We argue that the EPiC-FM generative model is a valid surrogate reference for JetClass jets and apply SURF to show that modern jet taggers may already be operating close to the true statistical limit. By contrast, we find that autoregressive GPT models unphysically…
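The Neyman-Pearson idea mentioned in the abstract can be illustrated with a toy sketch: when both class likelihoods are known exactly, the log-likelihood ratio is the optimal classifier score, so its AUC is the statistical limit for that problem. The example below is only an illustration under stated assumptions. It uses two hypothetical 1D Gaussian densities in place of any jet data or generative model, and the function names are invented for this sketch; it is not the SURF procedure itself.

```python
import math
import numpy as np


def gaussian_logpdf(x, mean, sigma):
    """Exact log-density of a 1D Gaussian (toy stand-in for a tractable likelihood)."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_likelihood_ratio(x, mean_sig=1.0, mean_bkg=0.0, sigma=1.0):
    """Neyman-Pearson optimal score: log p_sig(x) - log p_bkg(x)."""
    return gaussian_logpdf(x, mean_sig, sigma) - gaussian_logpdf(x, mean_bkg, sigma)


def auc_from_scores(scores_sig, scores_bkg):
    """AUC via the Mann-Whitney rank statistic (assumes continuous scores, no ties)."""
    scores = np.concatenate([scores_sig, scores_bkg])
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_sig, n_bkg = len(scores_sig), len(scores_bkg)
    u = ranks[:n_sig].sum() - n_sig * (n_sig + 1) / 2.0
    return u / (n_sig * n_bkg)


# Hypothetical toy classes: "signal" ~ N(1, 1), "background" ~ N(0, 1).
rng = np.random.default_rng(0)
x_sig = rng.normal(1.0, 1.0, size=100_000)
x_bkg = rng.normal(0.0, 1.0, size=100_000)

# AUC of the exact likelihood-ratio score on the samples.
auc_optimal = auc_from_scores(log_likelihood_ratio(x_sig), log_likelihood_ratio(x_bkg))

# Closed-form AUC for two unit-variance Gaussians one sigma apart: Phi(1 / sqrt(2)).
auc_analytic = 0.5 * (1.0 + math.erf(0.5))

print(f"empirical optimal AUC: {auc_optimal:.4f}, analytic: {auc_analytic:.4f}")
```

In this toy setting, no classifier trained on the same samples can exceed `auc_optimal` beyond statistical fluctuations, which is the sense in which an exact likelihood ratio defines an upper performance limit.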
Taxonomy
Topics: Particle physics theoretical and experimental studies · Computational Physics and Python Applications · Generative Adversarial Networks and Image Synthesis
