An investigation of over-training within semi-supervised machine learning models in the search for heavy resonances at the LHC
Benjamin Lieberman, Joshua Choma, Salah-Eddine Dahbi, Bruce Mellado, Xifeng Ruan

TL;DR
This paper investigates the risk of over-training in semi-supervised machine learning models used for detecting heavy resonances at the LHC, highlighting the potential for false signals due to over-fitting.
Contribution
It provides a quantitative analysis of false signal generation caused by over-training in semi-supervised models using toy Monte Carlo simulations.
Findings
Over-training can lead to false signals in semi-supervised models.
Quantification of false signal probability due to over-fitting.
Analysis performed on background-only Monte Carlo samples.
Abstract
In particle physics, semi-supervised machine learning is an attractive option for reducing model dependencies in searches beyond the Standard Model. When semi-supervised techniques are used to train machine learning models in the search for bosons at the Large Hadron Collider, the over-training of the model must be investigated. Internal fluctuations of the phase space and bias in training can cause semi-supervised models to label false signals within the phase space due to over-fitting. The issue of false signal generation in semi-supervised models has not been fully analyzed; the probability of such situations occurring must therefore be quantified, here using a toy Monte Carlo model. This investigation of resonances is performed using a pure background Monte Carlo sample. Through unique pure background samples extracted to mimic ATLAS data in a background-plus-signal…
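To illustrate how over-fitting can produce a false signal from background alone, here is a minimal toy sketch. It is not the authors' model or analysis. It assumes a one-dimensional feature drawn uniformly on [0, 1), a per-bin histogram "classifier" standing in for a trained network, and a simple significance estimate (n_B - n_A) / sqrt(n_A + n_B). All function names, the cut value, and the 3-sigma threshold are hypothetical choices made for illustration. Two samples, A and B, are drawn from the same distribution, so any excess of B after selection is spurious.

```python
import math
import random


def histogram_scores(sample_a, sample_b, n_bins):
    """Per-bin fraction of events from sample B (a stand-in 'classifier').

    More bins means more capacity to memorise statistical fluctuations.
    """
    counts_a = [0] * n_bins
    counts_b = [0] * n_bins
    for x in sample_a:
        counts_a[min(int(x * n_bins), n_bins - 1)] += 1
    for x in sample_b:
        counts_b[min(int(x * n_bins), n_bins - 1)] += 1
    scores = []
    for a, b in zip(counts_a, counts_b):
        total = a + b
        scores.append(b / total if total else 0.5)
    return scores


def excess_significance(sample_a, sample_b, scores, cut):
    """Assumed significance of the B-over-A excess among events passing the cut."""
    n_bins = len(scores)

    def passes(x):
        return scores[min(int(x * n_bins), n_bins - 1)] > cut

    n_a = sum(1 for x in sample_a if passes(x))
    n_b = sum(1 for x in sample_b if passes(x))
    if n_a + n_b == 0:
        return 0.0
    return (n_b - n_a) / math.sqrt(n_a + n_b)


def run_toys(n_toys, n_events, n_bins, cut=0.5, z_threshold=3.0, seed=1):
    """Return the fraction of background-only toys with a fake excess,
    evaluated on the training events and on independent test events."""
    rng = random.Random(seed)
    fake_train = 0
    fake_test = 0
    for _ in range(n_toys):
        # Both samples are pure background: same underlying distribution.
        train_a = [rng.random() for _ in range(n_events)]
        train_b = [rng.random() for _ in range(n_events)]
        test_a = [rng.random() for _ in range(n_events)]
        test_b = [rng.random() for _ in range(n_events)]
        scores = histogram_scores(train_a, train_b, n_bins)
        if excess_significance(train_a, train_b, scores, cut) > z_threshold:
            fake_train += 1
        if excess_significance(test_a, test_b, scores, cut) > z_threshold:
            fake_test += 1
    return fake_train / n_toys, fake_test / n_toys


if __name__ == "__main__":
    for bins in (2, 100):
        train_rate, test_rate = run_toys(n_toys=100, n_events=500, n_bins=bins)
        print(f"bins={bins}: fake-signal rate train={train_rate}, test={test_rate}")
```

In this sketch, raising `n_bins` increases the capacity of the stand-in classifier. The fake-excess rate on the training events then rises, while the rate on independent events stays near the nominal expectation. That is the kind of effect a background-only toy study can quantify.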
Taxonomy
Topics: Particle physics theoretical and experimental studies · Particle Detector Development and Performance · High-Energy Particle Collisions Research
