TL;DR
This study investigates how people interpret saliency-based explanations of text models, revealing common misunderstandings and proposing methods to improve explanation comprehension through adjusted saliency and visualization changes.
Contribution
It provides empirical insights into laypeople's interpretation of saliency explanations and introduces methods to enhance understanding by correcting perceptual distortions.
Findings
People often misinterpret explanations due to superficial factors like word length.
Adjusting saliency scores based on perception estimates improves understanding.
Using bar charts instead of heatmaps can reduce misinterpretation.
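The last two findings can be illustrated with a minimal sketch. This is not the paper's method: it assumes, purely for illustration, that perceived importance drifts linearly with word length, and it corrects for that drift before drawing a plain-text bar chart. The names `adjust_saliency`, `render_bars`, and `length_bias`, along with the default value of `length_bias`, are hypothetical.

```python
from typing import List


def adjust_saliency(words: List[str], saliency: List[float],
                    length_bias: float = 0.02) -> List[float]:
    """Counteract an ASSUMED linear word-length bias in perceived importance.

    length_bias is a hypothetical per-character distortion; a real value
    (and even its sign) would have to be estimated from user-study data.
    """
    mean_len = sum(len(w) for w in words) / len(words)
    # Shift each score against the assumed distortion for that word's length.
    adjusted = [s - length_bias * (len(w) - mean_len)
                for w, s in zip(words, saliency)]
    # Keep the scores in the [0, 1] range used for display.
    return [min(1.0, max(0.0, a)) for a in adjusted]


def render_bars(words: List[str], saliency: List[float],
                width: int = 20) -> str:
    """Draw one bar per word so that bar length alone encodes saliency."""
    pad = max(len(w) for w in words)
    lines = []
    for w, s in zip(words, saliency):
        bar = "#" * round(s * width)
        lines.append(f"{w.ljust(pad)} {bar} {s:.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    words = ["a", "wonderful", "film"]
    scores = [0.5, 0.5, 0.5]
    print(render_bars(words, adjust_saliency(words, scores)))
```

Under the assumed bias, equal raw scores are lowered for long words and raised for short ones, so the displayed values compensate for the expected misreading. The bars are left-aligned to a common baseline, which keeps their length independent of the width of the word they describe.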
Abstract
While a lot of research in explainable AI focuses on producing effective explanations, less work is devoted to the question of how people understand and interpret the explanation. In this work, we focus on this question through a study of saliency-based explanations over textual data. Feature-attribution explanations of text models aim to communicate which parts of the input text were more influential than others towards the model decision. Many current explanation methods, such as gradient-based or Shapley value-based methods, provide measures of importance which are well-understood mathematically. But how does a person receiving the explanation (the explainee) comprehend it? And does their understanding match what the explanation attempted to communicate? We empirically investigate the effect of various factors of the input, the feature-attribution explanation, and visualization…
Taxonomy
Methods: Heatmap
