On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Mengnan Du; Ninghao Liu; Fan Yang; Shuiwang Ji; Xia Hu

arXiv:1903.11245 · cs.CL · March 28, 2019 · 1 cites

TL;DR

This paper introduces REAT, a novel additive decomposition method that enhances the interpretability of RNNs by providing faithful, phrase-level attribution scores for predictions, applicable across various RNN architectures.

Contribution

The paper proposes REAT, a flexible attribution method that decomposes RNN predictions into additive contributions of words and phrases, improving interpretability and faithfulness over existing approaches.

Findings

01. REAT provides faithful, interpretable attributions for RNN predictions.

02. The method is applicable to various RNN architectures including GRU and LSTM.

03. Analysis reveals linguistic knowledge captured by RNNs and potential for debugging.

Abstract

RNN models have achieved state-of-the-art performance in a wide range of text mining tasks. However, these models are often regarded as black boxes and are criticized for their lack of interpretability. In this paper, we enhance the interpretability of RNNs by providing interpretable rationales for RNN predictions. Nevertheless, interpreting RNNs is a challenging problem. Firstly, unlike existing methods that rely on local approximation, we aim to provide rationales that are more faithful to the decision-making process of RNN models. Secondly, a flexible interpretation method should be able to assign contribution scores to text segments of varying lengths, instead of only to individual words. To tackle these challenges, we propose a novel attribution method, called REAT, to provide interpretations of RNN predictions. REAT decomposes the final prediction of an RNN into additive…
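The abstract is cut off before it describes how the decomposition works, so the following is only a rough illustration of what an additive decomposition of an RNN prediction can look like, not the authors' formulation. It assumes a minimal GRU whose update gate controls how much of the previous hidden state is carried forward, attributes to each word the part of the final hidden state that it newly wrote and that survived the later gates, and scores a phrase by summing its word scores. All names (`gru_forward`, `decompose`, the random weights, and the output vector `w`) are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gru_forward(xs, params):
    """Run a minimal GRU (no biases); return hidden states and 'keep' gates."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h = [0.0] * len(Uz)                      # h_0 = 0
    hs, keeps = [], []
    for x in xs:
        z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(Wh, x), matvec(Uh, rh))]
        keep = [1.0 - zi for zi in z]        # share of h_{t-1} carried into h_t
        h = [k * hp + zi * c for k, hp, zi, c in zip(keep, h, z, cand)]
        hs.append(h)
        keeps.append(keep)
    return hs, keeps

def decompose(hs, keeps):
    """Split the final hidden state into one additive term per time step.

    Term t = (product of keep gates after t) * (h_t - keep_t * h_{t-1}).
    The terms telescope, so they sum exactly to the final hidden state.
    """
    T, H = len(hs), len(hs[0])
    contribs = [None] * T
    carry = [1.0] * H                        # product of keep gates for steps > t
    for t in range(T - 1, -1, -1):
        prev = hs[t - 1] if t > 0 else [0.0] * H
        new_info = [h - k * p for h, k, p in zip(hs[t], keeps[t], prev)]
        contribs[t] = [c * n for c, n in zip(carry, new_info)]
        carry = [c * k for c, k in zip(carry, keeps[t])]
    return contribs

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
D, H, T = 3, 4, 5                            # input size, hidden size, sentence length
params = (rand_matrix(H, D), rand_matrix(H, H),
          rand_matrix(H, D), rand_matrix(H, H),
          rand_matrix(H, D), rand_matrix(H, H))
xs = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(T)]
w = [random.uniform(-1, 1) for _ in range(H)]  # hypothetical output weights for one class

hs, keeps = gru_forward(xs, params)
contribs = decompose(hs, keeps)

logit = sum(wi * hi for wi, hi in zip(w, hs[-1]))
word_scores = [sum(wi * ci for wi, ci in zip(w, c)) for c in contribs]
phrase_score = sum(word_scores[1:3])         # score for the phrase covering words 2 and 3

print("word scores:", [round(s, 4) for s in word_scores])
print("phrase score (words 2-3):", round(phrase_score, 4))
print("sum of word scores vs. logit:", round(sum(word_scores), 6), round(logit, 6))
```

Because the per-word terms telescope, their scores add up to the class logit exactly (before any softmax), which is the sense in which this sketch is additive. Whether REAT uses this gate or this phrase rule cannot be confirmed from the truncated abstract.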

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Software Engineering Research

Methods: Interpretability · Sigmoid Activation · Tanh Activation · Gated Recurrent Unit · Long Short-Term Memory