Attention cannot be an Explanation

Arjun R Akula; Song-Chun Zhu

arXiv:2201.11194·cs.HC·January 28, 2022

Open Access

TL;DR

This paper examines whether attention mechanisms can serve as reliable explanations for neural network decisions. Based on human studies, it concludes that attention-based explanations are ineffective at increasing human trust in, and reliance on, the underlying models.

Contribution

The study provides empirical evidence from human studies that attention weights are not suitable explanations, even on tasks where they correlate well with feature importance. This challenges their usefulness as an interpretability tool.

Findings

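01. Prior work has shown that attention weights are frequently uncorrelated with gradient-based measures of feature importance.

02. Human studies show that attention-based explanations do not increase human trust in, or reliance on, the underlying models, even on tasks where attention weights correlate well with feature importance.

03. Attention therefore cannot be used as an effective explanation.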

Abstract

Attention based explanations (viz. saliency maps), by providing interpretability to black box models such as deep neural networks, are assumed to improve human trust and reliance in the underlying models. Recently, it has been shown that attention weights are frequently uncorrelated with gradient-based measures of feature importance. Motivated by this, we ask a follow-up question: "Assuming that we only consider the tasks where attention weights correlate well with feature importance, how effective are these attention based explanations in increasing human trust and reliance in the underlying models?". In other words, can we use attention as an explanation? We perform extensive human study experiments that aim to qualitatively and quantitatively assess the degree to which attention based explanations are suitable in increasing human trust and reliance. Our experiment results show that…
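The abstract's premise rests on whether attention weights "correlate well" with gradient-based feature importance. As a minimal sketch of how such agreement might be measured, the snippet below computes a rank correlation between two per-token score vectors. Kendall's tau is used here as an assumption (the abstract does not say which statistic is meant), and the function name `kendall_tau` and all score values are hypothetical illustrations, not data from the paper.

```python
import numpy as np


def kendall_tau(a, b):
    """Kendall tau-a rank correlation between two equal-length score vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape or a.ndim != 1 or a.size < 2:
        raise ValueError("expected two 1-D vectors of equal length >= 2")
    n = a.size
    # Indices of every unordered pair (i, j) with i < j.
    i, j = np.triu_indices(n, k=1)
    # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
    signs = np.sign(a[i] - a[j]) * np.sign(b[i] - b[j])
    return float(signs.sum() / (n * (n - 1) / 2))


# Hypothetical per-token scores for one input (illustration only).
attention = np.array([0.50, 0.30, 0.15, 0.05])
grad_importance_agree = np.array([0.9, 0.6, 0.2, 0.1])     # same ranking
grad_importance_disagree = np.array([0.1, 0.2, 0.6, 0.9])  # reversed ranking

tau_agree = kendall_tau(attention, grad_importance_agree)        # 1.0
tau_disagree = kendall_tau(attention, grad_importance_disagree)  # -1.0
```

A value near 1 means attention and the gradient-based measure rank the input features the same way. The paper's question concerns only tasks in that regime, and asks whether attention-based explanations then help human users.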

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science · Adversarial Robustness in Machine Learning