Topic Modeling in New Physics Detection
Alexandre Alves, Eduardo da Silva Almeida, Douglas Roberto Pimentel

TL;DR
This paper explores the use of topic modeling as an unsupervised method to detect new physics signals in proton-proton collision data at the LHC, showing competitive performance against established outlier detection techniques.
Contribution
It introduces a novel application of topic modeling for new physics detection in low-multiplicity events, without relying on jet substructure variables.
Findings
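Topic modeling remains effective even in events with low particle multiplicity, without jet substructure variables.
It matches or outperforms established outlier detectors such as isolation forest and variational autoencoders (VAEs) under moderate and high background pollution.
This holds in almost all of the three new physics scenarios considered.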
Abstract
In this work, we apply topic modeling to detect new physics in proton-proton collisions at the LHC in an unsupervised way. We investigate three new physics scenarios where fully leptonic is the main source of background without relying on jet substructure variables. We demonstrate that the algorithm remains effective even in this low particle multiplicity framework, complementing jet tagging studies, where it is typically employed. Moreover, we demonstrate that the performance of topic modeling is competitive with, or even better than, that of well-known outlier detectors, such as isolation forest and variational autoencoders, under moderate and high background pollution in almost all new physics scenarios considered.
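Topic models operate on discrete "words" grouped into "documents", so applying one to collision data requires some way of turning each event into a bag of tokens. The abstract does not say how the authors do this. The sketch below shows one common approach under stated assumptions: each observable is binned, and the bin label becomes a word. The observable names (`lepton_pt`, `missing_et`), the bin edges, and the toy events are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges in GeV for two illustrative observables.
# These values are assumptions for the sketch, not the paper's choices.
BIN_EDGES = {
    "lepton_pt": [20.0, 40.0, 80.0, 160.0],
    "missing_et": [25.0, 50.0, 100.0, 200.0],
}


def event_to_tokens(event):
    """Map every measured value in one event to a discrete word."""
    tokens = []
    for name, values in event.items():
        edges = BIN_EDGES[name]
        for value in values:
            # bisect_right counts the edges <= value, which is the bin index.
            tokens.append(f"{name}_bin{bisect_right(edges, value)}")
    return tokens


def events_to_documents(events):
    """Build one bag-of-words Counter per event."""
    return [Counter(event_to_tokens(event)) for event in events]


# Toy events with made-up numbers, for illustration only.
events = [
    {"lepton_pt": [35.0, 22.0], "missing_et": [60.0]},
    {"lepton_pt": [250.0, 90.0], "missing_et": [310.0]},
]
documents = events_to_documents(events)
```

The resulting word counts could then be passed to any topic model implementation. With few particles per event, each document holds only a handful of words, which is the low-multiplicity regime the abstract refers to.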
Taxonomy
Topics: Particle physics theoretical and experimental studies · High-Energy Particle Collisions Research · Computational Physics and Python Applications
