IAFormer: Interaction-Aware Transformer network for collider data analysis
W. Esmail, A. Hammad, M. Nojiri

TL;DR
IAFormer is a new Transformer architecture for collider data analysis that uses sparse attention and physics-inspired features to improve efficiency and performance.
Contribution
It builds the attention matrix from predefined boost-invariant pairwise quantities and adds sparsity through differential attention, reducing model complexity while maintaining state-of-the-art accuracy.
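To make "physical quantities" concrete, here is a minimal sketch of how pairwise, longitudinal-boost-invariant features could be computed for the particles in a jet. The summary does not list which quantities IAFormer uses, so the feature set below (ln ΔR, ln kT, ln z, ln m²) is an assumption borrowed from the interaction features of the Particle Transformer, and the function name `pairwise_features` is hypothetical.

```python
import numpy as np

def pairwise_features(pt, rapidity, phi, energy, eps=1e-8):
    """Return an (N, N, 4) array of pairwise quantities for N particles.

    Assumed feature set (not taken from the source):
    ln(dR), ln(kT), ln(z), ln(m^2).
    """
    # Rapidity difference and azimuthal difference wrapped into [-pi, pi).
    d_y = rapidity[:, None] - rapidity[None, :]
    d_phi = (phi[:, None] - phi[None, :] + np.pi) % (2 * np.pi) - np.pi
    delta_r = np.hypot(d_y, d_phi)

    # Transverse-momentum-based quantities of the softer particle in each pair.
    pt_min = np.minimum(pt[:, None], pt[None, :])
    k_t = pt_min * delta_r
    z = pt_min / (pt[:, None] + pt[None, :])

    # Invariant mass squared of each pair from the summed four-momenta.
    px, py = pt * np.cos(phi), pt * np.sin(phi)
    pz = energy * np.tanh(rapidity)  # follows from the definition of rapidity
    e_sum = energy[:, None] + energy[None, :]
    p_sum_sq = sum((c[:, None] + c[None, :]) ** 2 for c in (px, py, pz))
    m_sq = e_sum**2 - p_sum_sq

    # Clip before the log so zero-valued entries (e.g. the diagonal) stay finite.
    feats = np.stack([delta_r, k_t, z, m_sq], axis=-1)
    return np.log(np.clip(feats, eps, None))
```

Because every entry depends only on rapidity differences, azimuthal differences, transverse momenta, and pair invariant masses, the output is unchanged when all particles are boosted along the beam axis.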
Findings
Achieves state-of-the-art classification accuracy on the top-tagging and quark-gluon datasets.
Reduces computational complexity by more than an order of magnitude compared to the Particle Transformer.
Effectively captures physically meaningful information through its attention mechanism.
Abstract
In this paper, we introduce IAFormer, a novel Transformer-based architecture that efficiently integrates pairwise particle interactions through a dynamic sparse attention mechanism. IAFormer introduces two new mechanisms. First, the attention matrix depends on predefined boost-invariant pairwise quantities, which significantly reduces the number of network parameters relative to the original Particle Transformer models. Second, IAFormer incorporates sparse attention by using "differential attention", so that it can dynamically prioritize relevant particle tokens while reducing the computational overhead associated with less informative ones. This approach significantly lowers the model complexity without compromising performance. Despite being more than an order of magnitude more computationally efficient than the Particle Transformer network,…
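The "differential attention" idea mentioned in the abstract can be illustrated with a short sketch: two softmax attention maps are computed and one is subtracted from the other, so that weight both maps assign to the same uninformative tokens cancels. The excerpt above does not say how IAFormer builds its two maps, so this sketch assumes the logits come from two hypothetical linear projections (`w_a`, `w_b`) of a pairwise feature tensor, and that the mixing weight `lam` is a fixed scalar rather than a learned parameter.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(logits_a, logits_b, values, lam=0.5):
    """Apply the difference of two softmax attention maps to the values.

    logits_a, logits_b: (N, N) attention logits.
    values: (N, D) token values.
    lam: scalar weight on the subtracted map (assumed fixed here).
    """
    attn = softmax(logits_a) - lam * softmax(logits_b)
    return attn @ values, attn

# Toy inputs: random stand-ins for pairwise features and token values.
rng = np.random.default_rng(0)
n, d = 5, 8
pair = rng.normal(size=(n, n, 4))      # placeholder pairwise feature tensor
w_a, w_b = rng.normal(size=(2, 4))     # hypothetical projections to logits
values = rng.normal(size=(n, d))

out, attn = differential_attention(pair @ w_a, pair @ w_b, values)
```

Each row of `attn` sums to `1 - lam`, and entries can be negative or close to zero, which is how the subtraction suppresses tokens that both maps weight similarly.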
