# Machine Learning on data with sPlot background subtraction

**Authors:** Maxim Borisyak, Nikita Kazeev

arXiv: 1905.11719 · 2019-12-03

## TL;DR

This paper introduces a mathematically rigorous way to train machine learning algorithms on data whose background is described by sPlot. The method never uses negative event weights, so standard, out-of-the-box ML techniques can be applied in high energy physics analyses.

## Contribution

The paper proposes a way to train ML models on sPlot background-subtracted data without negative event weights. This avoids the training divergence that occurs when negative weights leave the loss function with no lower bound.

## Key findings

- Enables training of ML models without negative event weights
- Yields signal probabilities conditioned on the observables
- Compatible with any off-the-shelf machine learning method
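
The signal-probability claim can be motivated by a short calculation. This is a sketch of the standard sPlot argument, not a reproduction of the paper's derivation, and the notation is an assumption: $m$ is the discriminating variable used to compute the sWeights, $x$ denotes the observables given to the model, $w_s(m)$ is the signal sWeight, and $f_s$, $f_b$ are the signal and background densities of $m$. It further assumes that $x$ and $m$ are independent within each class, and that the sWeights satisfy the usual normalization:

$$
\int w_s(m)\, f_s(m)\, dm = 1, \qquad \int w_s(m)\, f_b(m)\, dm = 0 .
$$

Conditioning on $x$ and splitting by class then gives

$$
\mathbb{E}\left[ w_s(m) \mid x \right]
= P(\text{signal} \mid x) \cdot 1 + P(\text{background} \mid x) \cdot 0
= P(\text{signal} \mid x) .
$$

Under these assumptions, individual weights may be negative, but their conditional average is a probability in $[0, 1]$, which is the quantity the paper aims to estimate.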

## Abstract

Data analysis in high energy physics often deals with data samples consisting of a mixture of signal and background events. The sPlot technique is a common method to subtract the contribution of the background by assigning weights to events. Some of the weights are negative by design. Negative weights cause the training of some machine learning algorithms to diverge, because the loss function has no lower bound. In this paper we propose a mathematically rigorous way to train machine learning algorithms on data samples with background described by sPlot to obtain signal probabilities conditioned on observables, without encountering negative event weights at all. This allows the use of any out-of-the-box machine learning method on such data.
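
The remark about a missing lower bound can be illustrated with a toy calculation. The sketch below is not the paper's method. It assumes one common way of using sWeights, where an event with weight `w` contributes `w` times the signal log-loss plus `1 - w` times the background log-loss. The function name `event_loss` and the weight values are hypothetical.

```python
import math


def event_loss(p, w):
    """Weighted cross-entropy of one event.

    p: predicted signal probability, strictly between 0 and 1.
    w: event weight used as a soft signal label (assumed form, see lead-in).
    """
    return -(w * math.log(p) + (1.0 - w) * math.log(1.0 - p))


# Predictions that approach zero signal probability.
probs = [1e-1, 1e-3, 1e-6, 1e-9]

# Weight inside [0, 1]: the loss is bounded below, with its minimum at p = w.
bounded = [event_loss(p, 0.3) for p in probs]
minimum = event_loss(0.3, 0.3)

# Negative weight: the loss keeps decreasing as p -> 0, so it has no minimum.
unbounded = [event_loss(p, -0.3) for p in probs]

for p, good, bad in zip(probs, bounded, unbounded):
    print(f"p={p:.0e}  loss(w=0.3)={good:8.3f}  loss(w=-0.3)={bad:8.3f}")
```

With `w = 0.3` the loss grows as the prediction moves away from `0.3`. With `w = -0.3` a sufficiently flexible model can lower the loss indefinitely by pushing the prediction for that event towards zero, which is the divergence the abstract describes.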

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.11719/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1905.11719/full.md

## References

11 references — full list in the complete paper: https://tomesphere.com/paper/1905.11719/full.md

---
Source: https://tomesphere.com/paper/1905.11719