Interpretations are useful: penalizing explanations to align neural networks with prior knowledge

Laura Rieger; Chandan Singh; W. James Murdoch; Bin Yu

arXiv:1909.13584·cs.LG·October 9, 2020·37 cites

TL;DR

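This paper introduces contextual decomposition explanation penalization (CDEP), a method that regularizes a neural network's explanations during training. When a model assigns importance to features it should not rely on, practitioners can penalize those explanations to correct the error and improve predictive accuracy.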

Contribution

The paper presents an explanation penalization technique that builds on existing explanation methods to increase predictive accuracy and correct wrongly assigned feature importance.

Findings

01. CDEP improves model performance on toy datasets.

02. CDEP corrects feature importance errors effectively.

03. The method enhances interpretability and accuracy simultaneously.

Abstract

For an explanation of a deep learning model to be effective, it must both provide insight into a model and suggest a corresponding action in order to achieve some objective. Too often, the litany of proposed explainable deep learning methods stops at the first step, providing practitioners with insight into a model, but no way to act on it. In this paper, we propose contextual decomposition explanation penalization (CDEP), a method which enables practitioners to leverage existing explanation methods in order to increase the predictive accuracy of deep learning models. In particular, when shown that a model has incorrectly assigned importance to some features, CDEP enables practitioners to correct these errors by directly regularizing the provided explanations. Using explanations provided by contextual decomposition (CD) (Murdoch et al., 2018), we demonstrate the ability of our method to…
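
The idea of "directly regularizing the provided explanations" can be illustrated with a small sketch. This is not the authors' implementation. It assumes a linear logistic model, where the contribution of a feature group to the logit is exactly the sum of `w_j * x_j` over the group, and uses that contribution as a stand-in for a CD score (the paper applies CD to deep networks). The function names, the synthetic data, and the values of `lam`, `lr`, and `steps` are all hypothetical choices made for illustration.

```python
import numpy as np

def penalized_loss_and_grad(w, X, y, mask, lam):
    """Logistic loss plus an L1 penalty on the importance of masked features.

    Assumption: for a linear model, the per-example importance of the masked
    feature group is (X * mask) @ w. This replaces a CD score in this sketch.
    """
    n = len(y)
    probs = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12  # guards against log(0)
    pred_loss = -np.mean(y * np.log(probs + eps) + (1 - y) * np.log(1 - probs + eps))
    pred_grad = X.T @ (probs - y) / n

    contrib = (X * mask) @ w                  # importance of the masked group
    expl_loss = np.mean(np.abs(contrib))      # explanation penalty
    expl_grad = (X * mask).T @ np.sign(contrib) / n  # subgradient of the penalty

    return pred_loss + lam * expl_loss, pred_grad + lam * expl_grad

def fit(X, y, mask, lam, lr=0.1, steps=2000):
    """Plain (sub)gradient descent on the penalized objective."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        _, grad = penalized_loss_and_grad(w, X, y, mask, lam)
        w -= lr * grad
    return w

# Synthetic data: feature 0 is the real signal, feature 1 is a shortcut
# that happens to track the label in the training set.
rng = np.random.default_rng(0)
n = 500
signal = rng.normal(size=n)
y = (signal > 0).astype(float)
spurious = (2 * y - 1) + 0.1 * rng.normal(size=n)
X = np.column_stack([signal, spurious])

mask = np.array([0.0, 1.0])  # prior knowledge: feature 1 should not matter

w_plain = fit(X, y, mask, lam=0.0)  # no explanation penalty
w_pen = fit(X, y, mask, lam=2.0)    # explanation penalty switched on

print("weights without penalty:", w_plain)
print("weights with penalty:   ", w_pen)
```

With `lam=0.0` the model is free to lean on the shortcut feature; with the penalty switched on, the weight on the masked feature should be pushed toward zero while the weight on the real signal stays positive. The L1 penalty is one possible choice of explanation loss in this sketch, not a claim about the paper's exact objective.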

Code & Models

Videos

Interpretations are Useful: Penalizing Explanations to Align Neural Networks with Prior Knowledge · SlidesLive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning in Healthcare

Methods: Contextual Decomposition Explanation Penalization