Isolating Unisolated Upsilons with Anomaly Detection in CMS Open Data

Rikab Gambhir; Radha Mastandrea; Benjamin Nachman; Jesse Thaler

arXiv:2502.14036·hep-ph·August 15, 2025

TL;DR

This paper shows that machine learning-based anomaly detection can recover anti-isolated Upsilon decays to two muons in 13 TeV CMS Open Data from 2016, raising the signal significance from 1.6 sigma (dimuon mass spectrum alone) to 6.4 sigma and demonstrating that such methods are practical on real collider data.

Contribution

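The study applies an ML-based anomaly detection strategy to isolate Upsilon signals amid overwhelming anti-isolated backgrounds, shows that an ML-based estimate of the multi-feature likelihood is more sensitive than traditional cut-and-count methods, and distills a benchmark dataset from the CMS Open Data for future research.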

Findings

01

Raised the Upsilon signal significance to 6.4 sigma, up from 1.6 sigma using the dimuon mass spectrum alone

02

Demonstrated that an ML-based estimate of the multi-feature likelihood improves sensitivity over cut-and-count approaches

03

Provided a benchmark dataset for anomaly detection in collider data

Abstract

We present the first study of anti-isolated Upsilon decays to two muons ($\Upsilon \to \mu^{+} \mu^{-}$) in proton-proton collisions at the Large Hadron Collider. Using a machine learning (ML)-based anomaly detection strategy, we "rediscover" the $\Upsilon$ in 13 TeV CMS Open Data from 2016, despite overwhelming anti-isolated backgrounds. We elevate the signal significance to $6.4\sigma$ using these methods, starting from $1.6\sigma$ using the dimuon mass spectrum alone. Moreover, we demonstrate improved sensitivity from using an ML-based estimate of the multi-feature likelihood compared to traditional "cut-and-count" methods. Our work demonstrates that it is possible and practical to find real signals in experimental collider data using ML-based anomaly detection, and we distill a readily-accessible benchmark dataset from the CMS Open Data to facilitate future anomaly detection…
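To give a feel for the two significances quoted in the abstract, the sketch below converts a Gaussian significance Z into a tail probability. It assumes the common one-sided convention, p = 1 - Φ(Z); the abstract does not state which convention the paper uses, and the function name `one_sided_p_value` is purely illustrative and not taken from the paper or its repository.

```python
import math


def one_sided_p_value(z: float) -> float:
    """Upper-tail probability of a standard normal beyond z.

    Assumption: one-sided Gaussian convention, p = 1 - Phi(z).
    erfc is used instead of 1 - erf to keep precision for large z.
    """
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# The two significances quoted in the abstract.
for z in (1.6, 6.4):
    print(f"Z = {z:.1f} sigma -> one-sided p = {one_sided_p_value(z):.3g}")
```

Under this assumed convention, 1.6 sigma corresponds to a background fluctuation probability of a few percent, while 6.4 sigma corresponds to a probability below one in a billion, which is why the latter clears the conventional 5 sigma discovery threshold and the former does not.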

Code & Models

Repositories

hep-lbdl/dimuonad
PyTorch, official
