On the Explainability of Natural Language Processing Deep Models
Julia El Zini, Mariette Awad

TL;DR
This paper surveys explainability methods for NLP deep models, categorizing approaches by what they explain and discussing evaluation practices, aiming to advance understanding and development of interpretability in NLP.
Contribution
It provides a comprehensive framework for classifying and evaluating explainability methods in NLP, including a case study and future research directions.
Findings
Categorizes explainability methods into input, processing, and output levels.
Highlights the lack of standardized evaluation practices in NLP explainability.
Provides a case study on neural machine translation models.
Abstract
While there has been a recent explosion of work on Explainable AI (ExAI) on deep models that operate on imagery and tabular data, textual datasets present new challenges to the ExAI community. Such challenges can be attributed to the lack of input structure in textual data, the use of word embeddings that add to the opacity of the models, and the difficulty of visualizing the inner workings of deep models when they are trained on textual data. Lately, methods have been developed to address the aforementioned challenges and present satisfactory explanations of Natural Language Processing (NLP) models. However, such methods are yet to be studied in a comprehensive framework where common challenges are properly stated and rigorous evaluation practices and metrics are proposed. Motivated to democratize ExAI methods in the NLP field, we present in this work a survey that studies…
