A Song of (Dis)agreement: Evaluating the Evaluation of Explainable Artificial Intelligence in Natural Language Processing
Michael Neely, Stefan F. Schouten, Maurits Bleeker, Ana Lucic

TL;DR
This paper critically evaluates the common practice of using rank correlation to assess attention-based explanations in NLP. It finds only weak correlations between attention-based explanations and feature attribution methods, and advocates human-centered evaluation methods instead.
Contribution
It challenges the validity of using rank correlation for explanation evaluation and promotes human-in-the-loop testing for better interpretability assessment.
Findings
Attention explanations do not strongly correlate with feature attribution methods.
Different explanation methods do not correlate well with each other for transformer models.
Rank correlation is not a reliable metric for evaluating explanations.
Abstract
There has been significant debate in the NLP community about whether or not attention weights can be used as an explanation - a mechanism for interpreting how important each input token is for a particular prediction. The validity of "attention as explanation" has so far been evaluated by computing the rank correlation between attention-based explanations and existing feature attribution explanations using LSTM-based models. In our work, we (i) compare the rank correlation between five more recent feature attribution methods and two attention-based methods, on two types of NLP tasks, and (ii) extend this analysis to also include transformer-based models. We find that attention-based explanations do not correlate strongly with any recent feature attribution methods, regardless of the model or task. Furthermore, we find that none of the tested explanations correlate strongly with one…
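The rank-correlation check described in the abstract can be illustrated with a minimal sketch. The text shown here does not name the statistic, so Kendall's tau (the tau-a variant, with ties counted as neither concordant nor discordant) is an assumption. The function name `kendall_tau` and the per-token scores are hypothetical and not taken from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two equal-length lists of token importance scores.

    Assumption: the tau-a variant is used; tied pairs count as neither
    concordant nor discordant.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two tokens to compare rankings")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The pair is concordant when both methods order tokens i and j the same way.
        product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical importance scores for one five-token input (illustrative only).
attention = [0.40, 0.10, 0.25, 0.05, 0.20]    # e.g. attention weights
attribution = [0.30, 0.05, 0.35, 0.10, 0.20]  # e.g. a feature attribution method

tau = kendall_tau(attention, attribution)
print(f"{tau:.2f}")  # prints 0.60
```

A value near 1 means the two methods rank the tokens almost identically, and a value near 0 means the rankings are close to unrelated. In this toy example 8 of the 10 token pairs are ordered the same way by both methods, giving a tau of 0.6.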
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Advanced Graph Neural Networks
Methods: ALIGN
