Hierarchical interpretations for neural network predictions
Chandan Singh, W. James Murdoch, Bin Yu

TL;DR
This paper introduces hierarchical interpretations for neural network predictions using agglomerative contextual decomposition (ACD), which helps visualize feature contributions, diagnose errors, identify dataset bias, and improve trust in DNNs.
Contribution
The paper proposes a hierarchical interpretation method, agglomerative contextual decomposition (ACD), which clusters the input features of a trained DNN and reports each cluster's contribution to the final prediction.
Findings
ACD effectively diagnoses incorrect predictions.
ACD identifies dataset bias and improves trust.
Hierarchy is robust to adversarial noise.
Abstract
Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to DNNs being characterized as black boxes and consequently limited their applications. To ameliorate this problem, we introduce the use of hierarchical interpretations to explain DNN predictions through our proposed method, agglomerative contextual decomposition (ACD). Given a prediction from a trained DNN, ACD produces a hierarchical clustering of the input features, along with the contribution of each cluster to the final prediction. This hierarchy is optimized to identify clusters of features that the DNN learned are predictive. Using examples from Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect…
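The idea of building a hierarchy of feature clusters, each with a contribution score, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the greedy rule (merge the adjacent pair whose merged score has the largest magnitude), and the toy scoring function are all assumptions made for illustration. In ACD the score of a cluster would come from contextual decomposition of a trained DNN; here a hand-written stand-in plays that role.

```python
from typing import Callable, FrozenSet, List, Tuple

Group = FrozenSet[int]


def greedy_hierarchy(
    n_features: int, score: Callable[[Group], float]
) -> List[Tuple[Group, float]]:
    """Greedily merge adjacent feature groups (hypothetical helper).

    Returns every group that appears in the hierarchy, in order of
    creation, paired with its contribution score.
    """
    # Start with one group per input feature (e.g. one per word).
    groups: List[Group] = [frozenset([i]) for i in range(n_features)]
    history = [(g, score(g)) for g in groups]
    while len(groups) > 1:
        # Assumed merge rule: pick the adjacent pair whose union has
        # the largest absolute score.
        best = max(
            range(len(groups) - 1),
            key=lambda i: abs(score(groups[i] | groups[i + 1])),
        )
        merged = groups[best] | groups[best + 1]
        groups[best:best + 2] = [merged]
        history.append((merged, score(merged)))
    return history


# Toy stand-in for a DNN-derived score on the phrase "not good movie".
# Index 0 = "not", 1 = "good", 2 = "movie". These numbers are invented.
WEIGHTS = {0: -0.2, 1: 1.0, 2: 0.1}


def toy_score(group: Group) -> float:
    total = sum(WEIGHTS[i] for i in sorted(group))
    # Invented interaction: "not" and "good" together flip the sentiment.
    if {0, 1} <= group:
        total -= 2.0
    return total


if __name__ == "__main__":
    words = ["not", "good", "movie"]
    for group, value in greedy_hierarchy(len(words), toy_score):
        phrase = " ".join(words[i] for i in sorted(group))
        print(f"{phrase!r}: {value:+.2f}")
```

In this toy setup, "not" and "good" are merged first because their interaction dominates, and the resulting cluster carries a negative score even though "good" alone is positive. That is the kind of structure a hierarchical interpretation is meant to surface.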
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
Methods: Agglomerative Contextual Decomposition
