Non-Parametric Data-Driven Background Modelling using Conditional Probabilities
A. Chisholm, T. Neep, K. Nikolopoulos, R. Owen, E. Reynolds, J. Silva

TL;DR
This paper introduces two non-parametric, data-driven background modelling techniques for particle physics. They address the limitations of simulated background samples and parametric fits, which become more pressing with large datasets such as those from the CERN Large Hadron Collider.
Contribution
It presents two novel methods, one based on ancestral sampling and one based on a generative adversarial network (GAN), for flexible background estimation that relies on neither simulation nor a predefined functional form.
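As a rough illustration of what ancestral sampling from conditional probabilities means, the sketch below estimates p(x1) and p(x2 | x1) from a toy two-variable dataset and then draws new events by sampling x1 first and x2 conditioned on it. The histogram binning, the two-variable chain, and the function names `fit_conditional_histograms` and `ancestral_sample` are all assumptions made for illustration; the summary does not specify how the paper builds or samples its conditional densities.

```python
import numpy as np

def fit_conditional_histograms(x1, x2, bins=20):
    """Estimate p(x1) and p(x2 | x1) as normalised histograms (illustrative choice)."""
    counts, edges1, edges2 = np.histogram2d(x1, x2, bins=bins)
    row_sums = counts.sum(axis=1, keepdims=True)
    p1 = row_sums.ravel() / counts.sum()
    # Row i holds p(x2 bin | x1 bin i); empty rows fall back to a uniform distribution.
    safe = np.where(row_sums > 0, row_sums, 1.0)
    p2_given_1 = np.where(row_sums > 0, counts / safe, 1.0 / counts.shape[1])
    return p1, p2_given_1, edges1, edges2

def ancestral_sample(p1, p2_given_1, edges1, edges2, n, rng):
    """Draw n events: sample the x1 bin first, then the x2 bin given the x1 bin."""
    i = rng.choice(len(p1), size=n, p=p1)
    # Inverse-CDF sampling of the x2 bin, using the row that matches each x1 bin.
    cdf = np.cumsum(p2_given_1, axis=1)
    u = rng.random(n)
    j = (u[:, None] > cdf[i]).sum(axis=1)
    j = np.minimum(j, p2_given_1.shape[1] - 1)  # guard against rounding in the CDF
    # Place each event uniformly inside its selected bin.
    x1 = rng.uniform(edges1[i], edges1[i + 1])
    x2 = rng.uniform(edges2[j], edges2[j + 1])
    return x1, x2
```

Because x2 is drawn from the row selected by x1, the generated sample retains the correlation between the two variables at the resolution of the binning, which is the property that sampling each variable independently would lose.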
Findings
Both methods effectively model backgrounds in benchmark analyses.
The GAN approach captures complex correlations in data.
The ancestral sampling method provides reliable background estimates.
Abstract
Background modelling is one of the main challenges in particle physics data analysis. Commonly employed strategies include the use of simulated events of the background processes, and the fitting of parametric background models to the observed data. However, reliable simulations are not always available or may be extremely costly to produce. As a result, in many cases, uncertainties associated with the accuracy or sample size of the simulation are the limiting factor in the analysis sensitivity. At the same time, parametric models are limited by the a priori unknown functional form and parameter values of the background distribution. These issues become ever more pressing when large datasets become available, as is already the case at the CERN Large Hadron Collider, and when studying exclusive signatures involving hadronic backgrounds. Two novel and widely applicable non-parametric…
