TL;DR
This paper introduces a simple machine-learning-based method for estimating statistical significance in high-energy physics experiments. It is more accurate than traditional counting methods, especially for high-dimensional data.
Contribution
It presents a novel approach combining machine learning with likelihood-based inference to approximate optimal statistical significance in complex data analyses.
Findings
The method outperforms naive counting in approximating true significance.
It is effective with high-dimensional data where traditional methods struggle.
Application to LHC data demonstrates improved sensitivity estimates.
Abstract
Machine-learning techniques have become fundamental in high-energy physics and, for new physics searches, it is crucial to know their performance in terms of experimental sensitivity, understood as the statistical significance of the signal-plus-background hypothesis over the background-only one. We present here a simple method that combines the power of current machine-learning techniques to face high-dimensional data with the likelihood-based inference tests used in traditional analyses, which allows us to estimate the sensitivity for both discovery and exclusion limits through a single parameter of interest, the signal strength. Based on supervised learning techniques, it can also perform well with high-dimensional data, where traditional techniques cannot. We apply the method to a toy model first, so we can explore its potential, and then to an LHC study of new physics particles in…
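To make the contrast between naive counting and a likelihood-based estimate concrete, here is a minimal sketch. It is not the paper's implementation: it assumes a binned likelihood over a one-dimensional classifier score, a perfectly known background, and the standard Asimov approximation for the discovery significance. The score shapes (signal density 2x, background density 2(1 - x) on [0, 1]) and the yields `S_TOTAL` and `B_TOTAL` are hypothetical values chosen only for illustration.

```python
import numpy as np

def asimov_z(s, b):
    """Asimov discovery significance for independent Poisson bins.

    s, b: expected signal and background yields per bin (b > 0).
    A single bin reproduces the usual counting formula.
    """
    s = np.asarray(s, dtype=float)
    b = np.asarray(b, dtype=float)
    # Test statistic q0 on the Asimov data set n = s + b, summed over bins.
    q0 = 2.0 * np.sum((s + b) * np.log1p(s / b) - s)
    return float(np.sqrt(q0))

# Hypothetical expected yields (assumption, not from the paper).
S_TOTAL, B_TOTAL = 50.0, 1000.0

# Hypothetical classifier-score shapes on [0, 1] (assumption):
# signal density 2x peaks at 1, background density 2(1 - x) peaks at 0.
edges = np.linspace(0.0, 1.0, 11)
lo, hi = edges[:-1], edges[1:]
sig_frac = hi**2 - lo**2                  # integral of 2x over each bin
bkg_frac = (1.0 - lo)**2 - (1.0 - hi)**2  # integral of 2(1 - x) over each bin

# Naive counting: one bin holding all events.
z_counting = asimov_z(S_TOTAL, B_TOTAL)

# Likelihood over the score distribution: one term per bin.
z_binned = asimov_z(S_TOTAL * sig_frac, B_TOTAL * bkg_frac)

print(f"counting Z = {z_counting:.3f}, binned-score Z = {z_binned:.3f}")
```

Because the per-bin terms exploit the different shapes of signal and background in the classifier score, the binned estimate is never lower than the single-bin counting value under these assumptions. The paper's method works with the signal strength as the parameter of interest and also covers exclusion limits, which this sketch does not attempt.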
