Explanation on Pretraining Bias of Finetuned Vision Transformer
Bumjin Park, Jaesik Choi

TL;DR
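This paper introduces the Input-Attribution and Attention Score Vector (IAV), an interpretability tool for analyzing the pretraining bias of fine-tuned ViT models. It finds that attention-map properties such as generalization, robustness, and entropy do not depend on the pretraining type, whereas IAV trends do distinguish pretraining types.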
Contribution
The paper proposes Input-Attribution and Attention Score Vector (IAV) for analyzing attention map bias and pretraining effects in Vision Transformers, providing new insights into model interpretability.
Findings
Each ViT head has a specific agreement range on classification decisions.
Generalization, robustness, and entropy of attention maps are not properties of the pretraining type.
IAV trends can effectively distinguish different pretraining methods.
Abstract
As fine-tuning of pretrained models becomes more common, understanding the bias of the pretrained model is essential. However, there are few tools for analyzing the transformer architecture, and interpreting attention maps remains challenging. To address interpretability, we propose the Input-Attribution and Attention Score Vector (IAV), which measures the similarity between attention maps and input attributions and shows the general trend of interpretable attention patterns. We empirically explain the pretraining bias of supervised and unsupervised pretrained ViT models, and show that each head in ViT has a specific range of agreement with the classification decision. We show that generalization, robustness, and entropy of attention maps are not properties of the pretraining type. In contrast, the IAV trend can separate the pretraining types.
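To make the idea of comparing attention maps with input attributions concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state which similarity measure IAV uses, so cosine similarity is an assumption here, as are the function names (`cosine_similarity`, `similarity_vector`, `attention_entropy`) and the toy four-patch data. Each head's attention over image patches is compared with a single per-patch attribution vector, giving one score per head; the entropy helper illustrates the attention-map entropy mentioned in the abstract.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_vector(head_attentions, attribution):
    """One similarity score per head: attention over patches vs. input attribution.

    Assumption: cosine similarity stands in for the paper's similarity measure.
    """
    return [cosine_similarity(head, attribution) for head in head_attentions]

def attention_entropy(attention):
    """Shannon entropy (in nats) of one head's attention, normalized to sum to 1."""
    total = sum(attention)
    probs = [a / total for a in attention]
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical toy data: 4 image patches, 3 attention heads.
attribution = [0.7, 0.2, 0.1, 0.0]      # per-patch input-attribution scores
heads = [
    [0.7, 0.2, 0.1, 0.0],               # head 0: attends exactly where attribution is high
    [0.25, 0.25, 0.25, 0.25],           # head 1: uniform attention
    [0.0, 0.0, 0.0, 1.0],               # head 2: attends only to a zero-attribution patch
]

scores = similarity_vector(heads, attribution)
entropies = [attention_entropy(h) for h in heads]

for i, (s, e) in enumerate(zip(scores, entropies)):
    print(f"head {i}: similarity={s:.3f}, entropy={e:.3f}")
```

In this toy setup, a head whose attention matches the attribution scores highest, and a head focused on an unattributed patch scores zero. Collecting such scores across heads and layers is one way to picture the per-head "range of agreement" the abstract refers to.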
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Visual Attention and Saliency Detection · Infrared Target Detection Methodologies
