Class-Discriminative Attention Maps for Vision Transformers
Lennart Brocki, Jakub Binda, Neo Christopher Chung

TL;DR
This paper introduces class-discriminative attention maps (CDAM), a gradient-based method for explaining vision transformers by highlighting features relevant to specific classes or concepts, improving interpretability and relevance of explanations.
Contribution
The paper proposes CDAM, a novel gradient-based extension of attention maps for vision transformers that enhances class sensitivity and semantic relevance of explanations.
Findings
CDAM outperforms 7 other importance estimators on correctness, compactness, and class sensitivity.
Smooth and Integrated CDAM variants further improve explanation quality.
CDAM effectively explains medical image models for lung CT scans.
Abstract
Importance estimators are explainability methods that quantify feature importance for deep neural networks (DNN). In vision transformers (ViT), the self-attention mechanism naturally leads to attention maps, which are sometimes interpreted as importance scores that indicate which input features ViT models are focusing on. However, attention maps do not account for signals from downstream tasks. To generate explanations that are sensitive to downstream tasks, we have developed class-discriminative attention maps (CDAM), a gradient-based extension that estimates feature importance with respect to a known class or a latent concept. CDAM scales attention scores by how relevant the corresponding tokens are for the predictions of a classifier head. In addition to targeting the supervised classifier, CDAM can explain an arbitrary concept shared by selected samples by measuring similarity in…
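The abstract's description of scaling attention scores by token relevance to the classifier head can be illustrated with a small toy sketch. This is not the authors' implementation. It assumes a single attention head, a linear classifier head without bias, and attention weights held fixed when taking the gradient; the function name `toy_cdam` and all shapes are hypothetical. Under those assumptions, gradient times input on each token reduces to the attention weight multiplied by the token's projection onto the target class weights.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def toy_cdam(tokens, cls_query, head_weights, target_class):
    """Toy class-discriminative scores (illustrative assumption, not the paper's code).

    tokens:       (n_tokens, dim) token representations
    cls_query:    (dim,) query vector standing in for the class token
    head_weights: (n_classes, dim) linear classifier head, no bias
    """
    dim = tokens.shape[1]
    # Plain attention map: how strongly the class token attends to each token.
    attn = softmax(tokens @ cls_query / np.sqrt(dim))
    # Attention-pooled representation fed to the linear head.
    logits = head_weights @ (attn @ tokens)
    # Gradient of the target logit w.r.t. each token, with attention held fixed.
    grads = attn[:, None] * head_weights[target_class][None, :]
    # Gradient times input, summed over the feature dimension.
    scores = (grads * tokens).sum(axis=1)
    return attn, scores, logits

rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 4))
cls_query = rng.normal(size=4)
head_weights = rng.normal(size=(3, 4))

attn, scores_0, logits = toy_cdam(tokens, cls_query, head_weights, target_class=0)
_, scores_1, _ = toy_cdam(tokens, cls_query, head_weights, target_class=1)
```

In this toy setting the attention map `attn` is identical for every class, while `scores_0` and `scores_1` differ because each is weighted by a different row of the head. The scores are signed, and for each class they sum to that class's logit.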
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
