Transformer networks for Heavy flavor jet tagging
A. Hammad, Mihoko M Nojiri

TL;DR
This paper reviews the application of Transformer networks in heavy flavor jet tagging at high-energy colliders, highlighting performance improvements, interpretability, and physics-inspired modifications of deep learning models.
Contribution
It reviews the use of attention-based Transformer networks for heavy flavor jet tagging and discusses improvements from physics-motivated network modifications and interpretability methods.
Findings
Transformer networks improve jet tagging accuracy
Physics-inspired modifications enhance model performance
Interpretable methods aid understanding of network decisions
Abstract
In this article, we review recent machine learning methods used for the challenging identification of boosted heavy particles at high-energy colliders. Our primary focus is on attention-based Transformer networks. We report the performance of state-of-the-art deep learning networks and the further improvement obtained by modifying networks based on physics insights. Additionally, we discuss interpretability methods for understanding network decision-making, which are crucial when employing highly complex and deep networks.
Taxonomy
Topics: Graphite, nuclear technology, radiation studies · Nuclear reactor physics and engineering · Radiation Effects in Electronics
