Leveraging universality of jet taggers through transfer learning

Frédéric A. Dreyer; Radosław Grabarczyk; Pier Francesco Monni

arXiv:2203.06210·hep-ph·July 13, 2022

TL;DR

This paper shows that transfer learning can cut the training data needed for jet taggers in collider physics by about an order of magnitude and speed up training by up to a factor of three. It works because the universality of QCD means much of what a tagger learns for one signal carries over to another.

Contribution

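The paper introduces two prescriptions for transferring an existing jet tagger, built on the graph neural networks LundNet and ParticleNet, to a new signal: fine-tuning all the weights of the model, or freezing a fraction of them and retraining the rest. Both adapt the tagger with less data and less training time than training from scratch.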
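The two prescriptions can be illustrated with a minimal sketch. It is not the authors' implementation and does not use LundNet or ParticleNet: the "tagger" here is an assumed toy model with a single `backbone` unit and a logistic `head`, and the names `forward`, `transfer`, and `freeze_backbone` are hypothetical. Which weights the paper actually freezes is not stated on this page, so the backbone/head split is only an assumption made for illustration.

```python
import copy
import math


def forward(params, x):
    """Toy stand-in tagger: one tanh 'backbone' unit feeding a logistic 'head'."""
    h = math.tanh(params["backbone"]["w"] * x + params["backbone"]["b"])
    z = params["head"]["v"] * h + params["head"]["c"]
    return h, 1.0 / (1.0 + math.exp(-z))


def transfer(pretrained, data, freeze_backbone, lr=0.1, epochs=200):
    """Start from pretrained weights and retrain on the new signal.

    freeze_backbone=False: fine-tune every weight.
    freeze_backbone=True:  keep the backbone fixed and update only the head.
    """
    params = copy.deepcopy(pretrained)  # leave the original tagger untouched
    trainable = ["head"] if freeze_backbone else ["backbone", "head"]
    for _ in range(epochs):
        for x, y in data:
            h, p = forward(params, x)
            dz = p - y  # gradient of binary cross-entropy w.r.t. the logit z
            dh = dz * params["head"]["v"] * (1.0 - h * h)  # back through tanh
            grads = {
                "head": {"v": dz * h, "c": dz},
                "backbone": {"w": dh * x, "b": dh},
            }
            # Plain gradient descent, applied to the trainable groups only.
            for group in trainable:
                for name, g in grads[group].items():
                    params[group][name] -= lr * g
    return params


# Hypothetical pretrained weights and a tiny labelled sample for the new signal.
pretrained = {"backbone": {"w": 1.0, "b": 0.0}, "head": {"v": 0.5, "c": 0.0}}
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

frozen = transfer(pretrained, data, freeze_backbone=True)
tuned = transfer(pretrained, data, freeze_backbone=False)
```

With the backbone frozen, each update touches fewer weights, which is one plausible reason a frozen transfer can train faster. Fine-tuning keeps every weight adjustable but starts from a point that already encodes what the original training learned.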

Findings

01. Reliable $W$-boson and top-quark taggers can be obtained with ten times less training data.

02. Training is up to a factor of three faster.

03. Transfer learning reduces the computational cost of training jet taggers for collider experiments.

Abstract

A significant challenge in the tagging of boosted objects via machine-learning technology is the prohibitive computational cost associated with training sophisticated models. Nevertheless, the universality of QCD suggests that a large amount of the information learnt in the training is common to different physical signals and experimental setups. In this article, we explore the use of transfer learning techniques to develop fast and data-efficient jet taggers that leverage such universality. We consider the graph neural networks LundNet and ParticleNet, and introduce two prescriptions to transfer an existing tagger into a new signal based either on fine-tuning all the weights of a model or alternatively on freezing a fraction of them. In the case of $W$-boson and top-quark tagging, we find that one can obtain reliable taggers using an order of magnitude less data with a corresponding…
