Sub-microsecond Transformers for Jet Tagging on FPGAs
Lauri Laatu, Chang Sun, Arianna Cox, Abhijith Gandrakota, Benedikt Maier, Jennifer Ngadiuba, Zhiqiang Que, Wayne Luk, Maria Spiropulu, Alexander Tapper

TL;DR
This paper introduces the first sub-microsecond FPGA implementation of transformers for jet tagging, enabling real-time high-energy physics applications with high performance and low latency.
Contribution
The paper demonstrates an FPGA-based transformer model for jet tagging that reaches sub-microsecond latency, and it extends hls4ml with multi-head and linear attention support for use in real-time physics experiments.
Findings
Achieved ~100 nanosecond latency on FPGA for jet tagging
Fit the entire transformer model on a single FPGA using high-granularity quantization and distributed arithmetic optimization
Enhanced hls4ml with multi-head and linear attention support
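
The findings name linear attention without describing it. As a rough illustration of the general idea only (not the authors' hls4ml implementation), the sketch below replaces the softmax with a positive feature map so that the key/value products can be summed once and reused for every query, avoiding the n x n attention matrix. The feature map elu(x) + 1 and all function names are assumptions chosen for the example.

```python
import math

def feature_map(x):
    # elu(x) + 1, positive for every input (an assumed, commonly used choice)
    return x + 1.0 if x > 0 else math.exp(x)

def _map_rows(m):
    return [[feature_map(x) for x in row] for row in m]

def linear_attention(q, k, v):
    """q, k: n x d nested lists; v: n x d_v nested list. Returns n x d_v."""
    qf, kf = _map_rows(q), _map_rows(k)
    n, d, d_v = len(kf), len(kf[0]), len(v[0])
    # Key/value summaries, computed once: kv[a][b] = sum_j kf[j][a] * v[j][b]
    kv = [[sum(kf[j][a] * v[j][b] for j in range(n)) for b in range(d_v)]
          for a in range(d)]
    k_sum = [sum(kf[j][a] for j in range(n)) for a in range(d)]
    out = []
    for row in qf:
        norm = sum(row[a] * k_sum[a] for a in range(d))
        out.append([sum(row[a] * kv[a][b] for a in range(d)) / norm
                    for b in range(d_v)])
    return out

def quadratic_reference(q, k, v):
    """Same result, computed through the explicit n x n weight matrix."""
    qf, kf = _map_rows(q), _map_rows(k)
    out = []
    for qi in qf:
        w = [sum(a * b for a, b in zip(qi, kj)) for kj in kf]
        total = sum(w)
        out.append([sum(w[j] * v[j][b] for j in range(len(v))) / total
                    for b in range(len(v[0]))])
    return out
```

Both functions return the same values up to rounding. The first needs work proportional to n * d * d_v instead of n * n, which is the property that makes this family of attention variants attractive for low-latency hardware.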
Abstract
We present the first sub-microsecond transformer implementation on an FPGA achieving competitive performance for state-of-the-art high-energy physics benchmarks. Transformers have shown exceptional performance on multiple tasks in modern machine learning applications, including jet tagging at the CERN Large Hadron Collider (LHC). However, their computational complexity has until now prohibited their use in real-time applications, such as the hardware trigger systems of collider experiments. In this work, we demonstrate the first application of transformers for jet tagging on FPGAs, achieving nanosecond latency with superior performance compared to alternative baseline models. We leverage high-granularity quantization and distributed arithmetic optimization to fit the entire transformer model on a single FPGA, achieving the required throughput and latency. Furthermore, we…
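
The abstract credits high-granularity quantization for fitting the model on one FPGA but does not give the scheme or the bit widths. The sketch below is a generic illustration of the underlying idea, assuming signed fixed-point values with saturation and a separate bit width per weight. The function names and the example widths are hypothetical and are not taken from the paper.

```python
def quantize_fixed(x, total_bits, int_bits):
    """Round x to a signed fixed-point grid and saturate at the range limits.

    total_bits: full width including the sign bit (assumed >= 1).
    int_bits:   integer bits including the sign bit; the rest are fractional.
    """
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits
    lo = -(2 ** (total_bits - 1))
    hi = 2 ** (total_bits - 1) - 1
    code = max(lo, min(hi, round(x * scale)))  # integer code stored in hardware
    return code / scale

def quantize_per_weight(weights, widths):
    """Apply an individual (total_bits, int_bits) pair to each weight."""
    return [quantize_fixed(w, t, i) for w, (t, i) in zip(weights, widths)]

# The same value kept at 8 bits and at 3 bits
print(quantize_per_weight([0.3, 0.3], [(8, 2), (3, 2)]))  # [0.296875, 0.5]
```

Giving each weight its own width lets less important weights use fewer bits, which in turn reduces the logic needed to multiply by them on the FPGA.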
