Support Vector Machine Classification on a Biased Training Set: Multi-Jet Background Rejection at Hadron Colliders
Federico Sforza, Vittorio Lippi

TL;DR
This paper optimizes a Support Vector Machine classifier trained on a biased sample by feeding the result of a signal-background template fit, performed on a validation sample, back into both the optimization and the input variable selection. The method is applied to multi-jet background rejection at hadron colliders.
Contribution
It introduces a new approach that combines template fitting feedback with SVM optimization to handle biased training samples in collider data analysis.
Findings
Reports background rejection performance superior to any other prescription applied to the same final state at hadron colliders.
Applies the method to real collider data collected by the CDF experiment.
Describes the input variable selection and the training samples, which are derived partly from data and partly from simulation.
Abstract
This paper describes an innovative way to optimize a multivariate classifier, in particular a Support Vector Machine algorithm, on a problem characterized by a biased training sample. This is possible thanks to the feedback of a signal-background template fit performed on a validation sample and included both in the optimization process and in the input variable selection. The procedure is applied to a real case of interest at hadron collider experiments: the reduction and the estimate of the multi-jet background in the W → eν plus jets data sample collected by the CDF experiment. The training samples, partially derived from data and partially from simulation, are described in detail together with the input variables exploited for the classification. At present, the reached performance is superior to any other prescription applied to the same final state at hadron collider…
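The signal-background template fit mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' fitting procedure, which this summary does not specify. It is a simplified one-parameter least-squares fit, and the function names (`normalize`, `fit_signal_fraction`) and the toy histograms are hypothetical. It assumes the classifier output has already been binned into signal and background templates and a validation histogram.

```python
def normalize(hist):
    """Scale a histogram (list of bin counts) to unit area."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("histogram must have positive total content")
    return [h / total for h in hist]


def fit_signal_fraction(data, sig_template, bkg_template):
    """Return (f, chi2) for the model data_i ~ N * (f * s_i + (1 - f) * b_i).

    Simplifying assumption: unweighted least squares on normalized shapes,
    not a binned likelihood fit.
    """
    n_events = float(sum(data))
    d = normalize(data)
    s = normalize(sig_template)
    b = normalize(bkg_template)

    # Rearranged model: d_i - b_i = f * (s_i - b_i), which is linear in f,
    # so the least-squares solution has a closed form.
    num = sum((di - bi) * (si - bi) for di, si, bi in zip(d, s, b))
    den = sum((si - bi) ** 2 for si, bi in zip(s, b))
    if den == 0:
        raise ValueError("signal and background templates are identical")
    f = min(1.0, max(0.0, num / den))  # keep the fraction in [0, 1]

    # Pearson chi-square between observed and fitted bin counts
    expected = [n_events * (f * si + (1.0 - f) * bi) for si, bi in zip(s, b)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(data, expected) if e > 0)
    return f, chi2


# Toy classifier-output histograms (hypothetical numbers, 4 bins)
sig_template = [1, 2, 3, 4]        # signal-like: peaks at high output
bkg_template = [4, 3, 2, 1]        # background-like: peaks at low output
validation = [190, 230, 270, 310]  # built as 70% signal + 30% background

fraction, chi2 = fit_signal_fraction(validation, sig_template, bkg_template)
print(f"fitted signal fraction = {fraction:.3f}, chi2 = {chi2:.3g}")
```

In an optimization loop of the kind the abstract describes, quantities such as the fitted fraction and the fit chi-square could serve as feedback when comparing classifier configurations or input variable sets. How the paper combines them is not stated in this summary.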
