DAVE: Distribution-aware Attribution via ViT Gradient Decomposition
Adam Wróbel, Siddhartha Gairola, Jacek Tabor, Bernt Schiele, Bartosz Zieliński, Dawid Rymarczyk

TL;DR
DAVE is a novel attribution method for Vision Transformers that decomposes gradients to produce stable, high-resolution explanations by isolating architecture-induced artifacts from meaningful signals.
Contribution
The paper introduces a mathematically grounded gradient decomposition technique tailored to ViTs, aimed at improving the stability and resolution of attribution maps.
Findings
Produces more stable attribution maps for ViTs
Separates architecture artifacts from true input signals
Enhances interpretability of Vision Transformer explanations
Abstract
Vision Transformers (ViTs) have become a dominant architecture in computer vision, yet producing stable and high-resolution attribution maps for these models remains challenging. Architectural components such as patch embeddings and attention routing often introduce structured artifacts in pixel-level explanations, causing many existing methods to rely on coarse patch-level attributions. We introduce DAVE (Distribution-aware Attribution via ViT Gradient Decomposition), a mathematically grounded attribution method for ViTs based on a structured decomposition of the input gradient. By exploiting architectural properties of ViTs, DAVE isolates locally equivariant and stable components of the effective input-output mapping. It separates these from architecture-induced artifacts and other sources of instability.
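The abstract's remark that patch embeddings introduce structured artifacts in pixel-level explanations can be illustrated with a toy example. The sketch below is not DAVE and does not reproduce the paper's decomposition, which the abstract does not spell out. It only computes the plain input gradient of a hypothetical one-layer model (a linear patch embedding followed by tanh and a linear readout) and shows that, inside every patch, the pixel gradient is confined to the span of the embedding filters. All names (`patchify`, `unpatchify`, `toy_model`, `input_gradient`) and the model itself are assumptions made for illustration.

```python
import numpy as np

def patchify(img, p):
    """Split an (H, W) image into non-overlapping p x p patches, shape (N, p*p)."""
    H, W = img.shape
    return img.reshape(H // p, p, W // p, p).transpose(0, 2, 1, 3).reshape(-1, p * p)

def unpatchify(patches, H, W, p):
    """Inverse of patchify: reassemble (N, p*p) patches into an (H, W) image."""
    return patches.reshape(H // p, W // p, p, p).transpose(0, 2, 1, 3).reshape(H, W)

def toy_model(img, W_embed, v, p):
    """Hypothetical stand-in for a ViT: patch embedding, tanh, sum over tokens, readout."""
    tokens = patchify(img, p) @ W_embed          # (N, d) token embeddings
    return np.tanh(tokens).sum(axis=0) @ v       # scalar output

def input_gradient(img, W_embed, v, p):
    """Analytic gradient of toy_model with respect to every pixel."""
    tokens = patchify(img, p) @ W_embed
    grad_tokens = (1.0 - np.tanh(tokens) ** 2) * v   # d(output)/d(tokens), shape (N, d)
    grad_patches = grad_tokens @ W_embed.T           # chain rule through the embedding
    return unpatchify(grad_patches, img.shape[0], img.shape[1], p)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, d = 4, 3                                  # patch size and embedding width (assumed)
    img = rng.normal(size=(8, 8))
    W_embed = rng.normal(size=(p * p, d))
    v = rng.normal(size=d)

    grad = input_gradient(img, W_embed, v, p)
    # Each patch of the gradient is a combination of the same d embedding filters,
    # so the stacked patch gradients have rank at most d.
    print("rank of patch gradients:", np.linalg.matrix_rank(patchify(grad, p)))
    print("embedding width d:", d)
```

Because every patch of the gradient map is a linear combination of the same few filters, the raw pixel-level map inherits the patch grid and the filter patterns regardless of image content. This is one simple way to see why an attribution method might want to separate such architecture-induced structure from the input-dependent signal.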
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
