Exploring the Space of Jets with CMS Open Data
Patrick T. Komiske, Radha Mastandrea, Eric M. Metodiev, Preksha Naik, and Jesse Thaler

TL;DR
This paper analyzes the space of jets using CMS Open Data, validating detector performance, comparing with simulations, and employing the energy mover's distance to explore jet configurations and their metric space.
Contribution
It introduces novel analyses of the metric space of jets, using the energy mover's distance (EMD) on CMS Open Data, and provides datasets and code for future research.
Findings
Good agreement between data and simulations for track-based observables
EMD effectively quantifies detector effects and jet similarities
Identifies typical and atypical jet configurations
Abstract
We explore the metric space of jets using public collider data from the CMS experiment. Starting from 2.3/fb of 7 TeV proton-proton collisions collected at the Large Hadron Collider in 2011, we isolate a sample of 1,690,984 central jets with transverse momentum above 375 GeV. To validate the performance of the CMS detector in reconstructing the energy flow of jets, we compare the CMS Open Data to corresponding simulated data samples for a variety of jet kinematic and substructure observables. Even without detector unfolding, we find very good agreement for track-based observables after using charged hadron subtraction to mitigate the impact of pileup. We perform a range of novel analyses, using the "energy mover's distance" (EMD) to measure the pairwise difference between jet energy flows. The EMD allows us to quantify the impact of detector effects, visualize the metric space of jets,…
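The pairwise EMD mentioned in the abstract can be illustrated as a small optimal transport problem: move transverse momentum from the particles of one jet to those of another at a cost proportional to angular distance, then add a penalty for any difference in total transverse momentum. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function name `emd`, the `(pt, y, phi)` array layout, and the default `R = 1.0` are all hypothetical choices made for illustration, and the linear program is solved with SciPy's general-purpose `linprog` rather than a dedicated transport solver.

```python
import numpy as np
from scipy.optimize import linprog


def emd(jet_a, jet_b, R=1.0):
    """Sketch of an energy mover's distance between two jets.

    Each jet is an array of shape (n, 3) with columns (pt, y, phi).
    This layout and the default R are assumptions for illustration.
    """
    jet_a = np.asarray(jet_a, dtype=float)
    jet_b = np.asarray(jet_b, dtype=float)
    pt_a, pt_b = jet_a[:, 0], jet_b[:, 0]
    n, m = len(pt_a), len(pt_b)

    # Pairwise angular distances in the rapidity-azimuth plane.
    dy = jet_a[:, None, 1] - jet_b[None, :, 1]
    dphi = jet_a[:, None, 2] - jet_b[None, :, 2]
    dphi = (dphi + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)
    theta = np.sqrt(dy**2 + dphi**2)

    # Flow variables f_ij are flattened row-major: index i * m + j.
    cost = (theta / R).ravel()

    # Outgoing flow from particle i is at most pt_a[i];
    # incoming flow to particle j is at most pt_b[j].
    A_ub = np.zeros((n + m, n * m))
    for i in range(n):
        A_ub[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_ub[n + j, j::m] = 1.0
    b_ub = np.concatenate([pt_a, pt_b])

    # Total flow equals the smaller of the two total pt values.
    A_eq = np.ones((1, n * m))
    b_eq = [min(pt_a.sum(), pt_b.sum())]

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    # Transport cost plus the penalty for mismatched total pt.
    return res.fun + abs(pt_a.sum() - pt_b.sum())


# Two toy single-particle jets (made-up values, not CMS data).
jet_1 = np.array([[100.0, 0.0, 0.0]])
jet_2 = np.array([[80.0, 0.3, 0.4]])
print(emd(jet_1, jet_2))
```

In this toy case the two particles are separated by an angular distance of 0.5, so 80 GeV is transported at a cost of 40 GeV and the remaining 20 GeV difference is added as a penalty. The dense constraint matrix grows as (n + m) × (n · m), so this form is only suitable for small examples; computing distances across a large jet sample would call for a dedicated optimal transport solver.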
