Interpreting Deep Learning Models in Natural Language Processing: A Review

Xiaofei Sun; Diyi Yang; Xiaoya Li; Tianwei Zhang; Yuxian Meng; Han Qiu; Guoyin Wang; Eduard Hovy; Jiwei Li

arXiv:2110.10470·cs.CL·October 26, 2021·25 cites

TL;DR

This paper reviews various interpretation methods for neural network models in NLP, highlighting their categories, sub-categories, limitations, and future research directions to improve model interpretability.

Contribution

It provides a comprehensive taxonomy and detailed analysis of existing interpretation methods for neural NLP models, identifying gaps and proposing future research avenues.

Findings

01. High-level taxonomy of interpretation methods in NLP

02. Detailed description of sub-categories such as influence functions and attention

03. Identification of deficiencies and future research directions

Abstract

Neural network models have achieved state-of-the-art performance in a wide range of natural language processing (NLP) tasks. However, a long-standing criticism of neural network models is their lack of interpretability, which not only reduces the reliability of neural NLP systems but also limits the scope of their applications in areas where interpretability is essential (e.g., health care applications). In response, the increasing interest in interpreting neural NLP models has spurred a diverse array of interpretation methods over recent years. In this survey, we provide a comprehensive review of various interpretation methods for neural models in NLP. We first sketch out a high-level taxonomy for interpretation methods in NLP, i.e., training-based approaches, test-based approaches, and hybrid approaches. Next, we describe sub-categories in each category in detail, e.g.,…

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning