Machine Learning on sWeighted Data
Maxim Borisyak, Nikita Kazeev

TL;DR
This paper introduces a mathematically rigorous method to convert sPlot weights, which can be negative, into valid class probabilities, facilitating the application of machine learning techniques in high energy physics data analysis.
Contribution
The paper proposes a novel transformation of sPlot weights into probabilities, overcoming the challenge of negative weights for machine learning applications.
Findings
Enables neural network training with sPlot-derived data
Converts sPlot weights into valid class probabilities conditioned on observables
Facilitates broader application of ML in high energy physics
Abstract
Data analysis in high energy physics has to deal with data samples produced from different sources. One of the most widely used ways to unfold their contributions is the sPlot technique. It uses the results of a maximum likelihood fit to assign weights to events. Some weights produced by sPlot are by design negative. Negative weights make it difficult to apply machine learning methods: the loss function becomes unbounded, which leads to divergent neural network training. In this paper we propose a mathematically rigorous way to transform the weights obtained by sPlot into class probabilities conditioned on observables, thus enabling any machine learning algorithm to be applied out of the box.
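The claim that negative weights make the loss unbounded can be illustrated with a toy calculation. The sketch below is not taken from the paper: the function name `weighted_cross_entropy`, the three-event sample, and the weight values are all assumptions chosen for illustration. It evaluates a weighted binary cross-entropy while the predicted probability for one negatively weighted event is pushed towards zero.

```python
import numpy as np

def weighted_cross_entropy(p, labels, weights):
    """Weighted binary cross-entropy, summed over events (hypothetical helper)."""
    p = np.asarray(p, dtype=float)
    per_event = labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)
    return -np.sum(weights * per_event)

# Toy sample (assumed values): three events labelled as signal,
# the last one carrying a negative weight, as sPlot can produce.
labels = np.array([1.0, 1.0, 1.0])
weights = np.array([1.2, 0.9, -0.4])

# Drive the prediction for the negatively weighted event towards 0
# while keeping the other predictions fixed.
losses = []
for eps in [1e-1, 1e-3, 1e-6, 1e-12]:
    p = np.array([0.9, 0.9, eps])
    losses.append(weighted_cross_entropy(p, labels, weights))

for eps, loss in zip([1e-1, 1e-3, 1e-6, 1e-12], losses):
    print(f"p_last = {eps:.0e}  loss = {loss:.3f}")
```

In this toy setup the loss keeps decreasing as the prediction for the negatively weighted event approaches zero, so an optimizer has no finite minimum to converge to. With non-negative weights the same term would instead grow, which is why mapping the weights to valid probabilities restores a bounded objective.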
