TL;DR
This paper introduces a new method for interpreting Transformer models by computing relevancy scores through Deep Taylor Decomposition, effectively visualizing decision-making in both vision and text tasks.
Contribution
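It presents a relevancy propagation technique for Transformers, based on the Deep Taylor Decomposition principle, that propagates scores through attention layers and skip connections and outperforms existing explainability methods on the reported benchmarks.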
Findings
Outperforms existing explainability methods on visual Transformer benchmarks
Effective relevancy propagation through attention layers and skip connections
Applicable to both vision and text Transformer models
Abstract
Self-attention techniques, and specifically Transformers, are dominating the field of text processing and are becoming increasingly popular in computer vision classification tasks. In order to visualize the parts of the image that led to a certain classification, existing methods either rely on the obtained attention maps or employ heuristic propagation along the attention graph. In this work, we propose a novel way to compute relevancy for Transformer networks. The method assigns local relevance based on the Deep Taylor Decomposition principle and then propagates these relevancy scores through the layers. This propagation involves attention layers and skip connections, which challenge existing methods. Our solution is based on a specific formulation that is shown to maintain the total relevancy across layers. We benchmark our method on very recent visual Transformer networks, as well…
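The conservation property mentioned in the abstract (total relevancy is maintained across layers) can be illustrated on a single linear layer. The sketch below uses the generic epsilon rule from layer-wise relevance propagation, not the paper's specific formulation for attention layers and skip connections; the function name `lrp_linear` and the toy values are assumptions made only for illustration.

```python
import numpy as np

def lrp_linear(x, W, relevance_out, eps=1e-9):
    """Redistribute relevance from the outputs of z = x @ W back to the inputs.

    Generic epsilon-rule sketch (an assumption, not the paper's exact rule).
    """
    z = x @ W                                        # forward pre-activations, shape (out,)
    stabilizer = eps * np.where(z >= 0, 1.0, -1.0)   # avoids division by zero
    s = relevance_out / (z + stabilizer)             # relevance per unit of output
    return x * (W @ s)                               # each input's share, shape (in,)

# Hypothetical toy layer: 3 inputs, 2 outputs.
x = np.array([1.0, 2.0, -1.0])
W = np.array([[0.5, -1.0],
              [0.25, 0.5],
              [-1.0, 0.5]])
relevance_out = np.array([0.7, 0.3])

relevance_in = lrp_linear(x, W, relevance_out)

# Conservation check: the relevance entering the layer matches what leaves it.
print(relevance_in, relevance_in.sum(), relevance_out.sum())
```

Each input receives relevance in proportion to its contribution `x[i] * W[i, j]` to every output `z[j]`, so the sum over inputs equals the sum over outputs up to the small stabilizer term.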
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Multi-Head Attention · Softmax · Residual Connection · Adam · Attention Is All You Need · Byte Pair Encoding · Layer Normalization
