TL;DR
The paper introduces PHAT-JeT, a hierarchical transformer model that combines geometric message passing and patch-based attention to achieve high-accuracy jet tagging under strict computational constraints.
Contribution
It proposes a novel hierarchical transformer architecture that efficiently encodes local and global features for particle jet tagging, outperforming existing models under resource limits.
Findings
Achieves state-of-the-art accuracy on four jet tagging benchmarks.
Maintains high background rejection within limited computational budgets.
Outperforms existing resource-constrained models in efficiency and accuracy.
Abstract
Real-time jet tagging is critical for identifying short-lived particle decays in the high-throughput detectors of the Large Hadron Collider, where the trigger systems that decide which collision events to store impose strict latency and accuracy constraints. While transformer architectures achieve the highest jet tagging accuracy when compute is unconstrained, their quadratic self-attention cost makes inference prohibitive within the trigger budget. Existing efficient variants reduce the computational cost but degrade classification performance. To address this limitation, we introduce the Patch Hierarchical Attention Transformer (PHAT-JeT), which combines two mechanisms: a physics-inspired geometric message-passing module that encodes local detector-plane structure, and a hierarchical patch-based attention scheme that computes exact attention within small particle groups…
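To make the cost argument in the abstract concrete, the following is a minimal sketch of exact attention restricted to small particle groups, written in plain Python. It is not the authors' implementation. The function names (`attention`, `patch_attention`, `score_counts`), the grouping of particles by contiguous index, and the requirement that the particle count be divisible by the patch size are all assumptions made for illustration. The abstract is truncated before it explains how patches are formed or how information flows between them, so the hierarchical part is deliberately left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Exact scaled dot-product attention over lists of vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        # One score per (query, key) pair: this is the quadratic part.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v))
                    for c in range(len(v[0]))])
    return out


def patch_attention(q, k, v, patch_size):
    """Exact attention computed independently inside each patch.

    Assumption: patches are contiguous index ranges. A real tagger would
    more plausibly group particles by detector-plane proximity.
    """
    n = len(q)
    if n % patch_size != 0:
        raise ValueError("number of particles must be divisible by patch_size")
    out = []
    for start in range(0, n, patch_size):
        end = start + patch_size
        out.extend(attention(q[start:end], k[start:end], v[start:end]))
    return out


def score_counts(n, patch_size):
    """Number of attention scores: (full attention, patch-local attention)."""
    return n * n, n * patch_size
```

Under these assumptions, a jet with 128 particles split into patches of 16 needs 128 × 16 = 2,048 attention scores instead of 128 × 128 = 16,384, an 8× reduction. The trade-off is that particles in different patches do not attend to each other at this level, which is presumably what the hierarchical stage of the model addresses.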
