TL;DR
This paper systematically compares single-variable and multivariate jet tagging methods that aim to identify signal while preserving the shape of the background distribution, and highlights data augmentation techniques that are both effective and computationally cheap.
Contribution
It evaluates data augmentation techniques such as Planing and PCA-based scaling, and is the first study to show that they perform comparably to the more complex augmented training methods (Adversarial NN and uBoost) in jet tagging.
Findings
Data augmentation techniques (Planing, PCA-based scaling) perform similarly to augmented training techniques (Adversarial NN, uBoost).
Planing and PCA scaling are easier to implement and computationally cheaper.
Methods effectively balance signal identification with background shape preservation.
Abstract
Searching for new physics in large data sets requires a balance between two competing effects: signal identification and background distortion. In this work, we perform a systematic study of both single-variable and multivariate jet tagging methods that aim for this balance. The methods preserve the shape of the background distribution by augmenting either the training procedure or the data itself. Multiple quantitative metrics are considered to compare the methods, for tagging 2-, 3-, or 4-prong jets against the QCD background. This is the first study to show that the data augmentation techniques of Planing and PCA-based scaling deliver performance similar to that of the augmented training techniques of Adversarial NN and uBoost, but are both easier to implement and computationally cheaper.
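To illustrate why a data augmentation technique can be cheap to implement, the sketch below shows one common reading of Planing: each event is weighted by the inverse of the histogram density of a chosen variable (here assumed to be the jet mass), so that the weighted distribution of that variable becomes flat and a classifier trained with these weights has less incentive to learn it. The abstract does not spell out the procedure, so the function name `planing_weights`, the choice of jet mass, and the binning are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

def planing_weights(mass, bins=50):
    """Hypothetical helper: per-event weights that flatten the mass histogram.

    Assumption: planing is done by inverse-histogram reweighting of one variable.
    """
    mass = np.asarray(mass, dtype=float)
    # Histogram the variable over its full observed range.
    counts, edges = np.histogram(mass, bins=bins)
    # Bin index of every event; interior edges map values to 0 .. bins-1.
    idx = np.digitize(mass, edges[1:-1])
    # Inverse-density weight: events in crowded bins are down-weighted.
    weights = 1.0 / counts[idx]
    # Normalise so the average weight is 1 (keeps the effective sample scale).
    weights *= len(mass) / weights.sum()
    return weights

# Toy usage: a falling, QCD-like mass spectrum (synthetic numbers, not real data).
rng = np.random.default_rng(0)
toy_mass = rng.exponential(scale=50.0, size=10_000)
toy_weights = planing_weights(toy_mass, bins=20)
```

In practice such weights would typically be computed separately for the signal and background samples and passed to the classifier as per-event training weights; that usage is likewise an assumption here rather than a detail taken from the abstract.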
