Can Transformer Attention Spread Give Insights Into Uncertainty of Detected and Tracked Objects?
Felicia Ruppel, Florian Faion, Claudius Gläser, Klaus Dietmayer

TL;DR
This paper explores how transformer attention weights in object detection and tracking models can provide insights into the uncertainty of detected objects, especially in unstructured or novel environments.
Contribution
It investigates the distribution and evolution of attention weights across decoder layers and object lifetime, proposing their use as indicators of detection uncertainty.
Findings
Attention weights vary across decoder layers and object lifetime.
Attention distributions can reflect detection confidence levels.
Attention-based uncertainty measures improve reliability in novel environments.
Abstract
Transformers have recently been utilized to perform object detection and tracking in the context of autonomous driving. One unique characteristic of these models is that attention weights are computed in each forward pass, giving insights into the model's interior, in particular, which part of the input data it deemed interesting for the given task. Such an attention matrix with the input grid is available for each detected (or tracked) object in every transformer decoder layer. In this work, we investigate the distribution of these attention weights: How do they change through the decoder layers and through the lifetime of a track? Can they be used to infer additional information about an object, such as a detection uncertainty? Especially in unstructured environments, or environments that were not common during training, a reliable measure of detection uncertainty is crucial to decide…
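As a rough illustration of what "attention spread" could mean for a single object, the sketch below summarizes one attention map over the input grid with two scalars: the Shannon entropy of the normalized weights and their weighted spatial standard deviation. Both measures, the function name `attention_spread`, and the rows-by-columns grid layout are assumptions made for this example; the excerpt does not state which spread measure the authors use.

```python
import numpy as np


def attention_spread(attn):
    """Summarize how widely one object's attention is distributed.

    attn: 2D array of non-negative attention weights over the input grid
          (hypothetical layout: rows x cols).
    Returns (entropy_in_nats, spatial_std_in_cells).
    """
    attn = np.asarray(attn, dtype=float)
    total = attn.sum()
    if total <= 0:
        raise ValueError("attention map must have positive total weight")
    p = attn / total  # normalize to a probability distribution over cells

    # Shannon entropy; cells with zero weight contribute nothing.
    nonzero = p[p > 0]
    entropy = float(-(nonzero * np.log(nonzero)).sum())

    # Weighted spatial standard deviation around the attention centroid.
    rows, cols = np.indices(p.shape)
    mean_r = (p * rows).sum()
    mean_c = (p * cols).sum()
    variance = (p * ((rows - mean_r) ** 2 + (cols - mean_c) ** 2)).sum()
    return entropy, float(np.sqrt(variance))


# A sharply focused map versus a fully diffuse one on a 4x4 grid.
focused = np.zeros((4, 4))
focused[1, 2] = 1.0
diffuse = np.ones((4, 4))

focused_entropy, focused_std = attention_spread(focused)
diffuse_entropy, diffuse_std = attention_spread(diffuse)
```

Under this assumed measure, a map concentrated on a single cell yields zero entropy and zero spatial spread, while a uniform map yields the maximum entropy for the grid size. Applying the function to the map from each decoder layer, or to each frame of a track, would give the per-layer and per-lifetime curves the abstract asks about.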
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · CCD and CMOS Imaging Sensors
