LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity
Walid Bousselham, Angie Boggust, Sofian Chaybouti, Hendrik Strobelt, and Hilde Kuehne

TL;DR
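LeGrad is an explainability method for Vision Transformers (ViTs) that treats the gradient with respect to each layer's attention maps as the explanation signal and aggregates it across layers. The authors report that it compares favorably with existing methods in segmentation, perturbation, and open-vocabulary settings.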
Contribution
LeGrad introduces a simple, effective gradient-based explainability technique tailored for ViTs, aggregating signals across layers for improved interpretability.
Findings
LeGrad achieves higher spatial fidelity than existing methods.
Its explanations remain robust under perturbation.
LeGrad performs well across segmentation, perturbation, and open-vocabulary tasks.
Abstract
Vision Transformers (ViTs), with their ability to model long-range dependencies through self-attention mechanisms, have become a standard architecture in computer vision. However, the interpretability of these models remains a challenge. To address this, we propose LeGrad, an explainability method specifically designed for ViTs. LeGrad computes the gradient with respect to the attention maps of ViT layers, considering the gradient itself as the explainability signal. We aggregate the signal over all layers, combining the activations of the last as well as intermediate tokens to produce the merged explainability map. This makes LeGrad a conceptually simple and easy-to-implement tool for enhancing the transparency of ViTs. We evaluate LeGrad in challenging segmentation, perturbation, and open-vocabulary settings, showcasing its versatility compared to other SotA explainability methods…
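The layer-wise aggregation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (toy_attention_grad, layer_map, merged_map) are hypothetical, and the toy score function, the class token at index 0, the clipping of negative gradients, and the min-max normalization are all assumptions made for illustration rather than details stated in the abstract. In practice the gradients would come from an automatic-differentiation framework applied to a real ViT; here a toy layer with a closed-form gradient stands in for it.

```python
import numpy as np


def toy_attention_grad(values, readout):
    """Analytic d(score)/d(attention) for a toy layer (illustrative assumption).

    Toy score: s = sum_h mean_i (A_h @ V_h)[i] . u_h, which gives
    ds/dA_h[i, j] = (V_h[j] . u_h) / n for every query row i.
    values: (heads, tokens, dim); readout: (heads, dim).
    """
    heads, n, _ = values.shape
    per_key = np.einsum("hjd,hd->hj", values, readout) / n
    return np.broadcast_to(per_key[:, None, :], (heads, n, n)).copy()


def layer_map(attn_grad):
    """Clip negative gradients, then average over heads and query tokens."""
    return np.maximum(attn_grad, 0.0).mean(axis=(0, 1))  # shape: (tokens,)


def merged_map(attn_grads, grid):
    """Sum per-layer maps, drop the class token, min-max normalize, reshape."""
    total = sum(layer_map(g) for g in attn_grads)
    patches = total[1:]  # assumption: token 0 is the class token
    span = patches.max() - patches.min()
    if span > 0:
        patches = (patches - patches.min()) / span
    else:
        patches = np.zeros_like(patches)  # flat signal: nothing to highlight
    return patches.reshape(grid)


rng = np.random.default_rng(0)
heads, tokens, dim, layers = 2, 5, 3, 3  # 1 class token + a 2x2 patch grid
grads = [
    toy_attention_grad(rng.normal(size=(heads, tokens, dim)),
                       rng.normal(size=(heads, dim)))
    for _ in range(layers)
]
heatmap = merged_map(grads, grid=(2, 2))
print(heatmap.shape)
```

The resulting array holds one value in [0, 1] per image patch and could be upsampled to the input resolution for display. Summing the per-layer maps before normalizing lets intermediate layers contribute alongside the last one, which is the aggregation idea the abstract describes.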
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning
