Explaining the Road Not Taken
Hua Shen, Ting-Hao 'Kenneth' Huang

TL;DR
This paper reviews explanation methods for NLP models, highlighting that most current approaches fail to address users' questions about why a model chose one result over a similar alternative.
Contribution
The paper summarizes the common forms of explanation (such as feature attribution, decision rules, and probes) used in over 200 recent NLP papers and compares them against the user questions collected in the XAI Question Bank.
Findings
Most explanation methods cannot answer 'road not taken' questions.
Users are interested in understanding why a model preferred one result over a similar alternative.
Current explanations often do not meet user needs for interpretability.
Abstract
It is unclear if existing interpretations of deep neural network models respond effectively to the needs of users. This paper summarizes the common forms of explanations (such as feature attribution, decision rules, or probes) used in over 200 recent papers about natural language processing (NLP), and compares them against user questions collected in the XAI Question Bank. We found that although users are interested in explanations for the road not taken -- namely, why the model chose one result and not a well-defined, seemingly similar legitimate counterpart -- most model interpretations cannot answer these questions.
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Machine Learning in Healthcare
