Embedding Deep Networks into Visual Explanations
Zhongang Qi, Saeed Khorram, Fuxin Li

TL;DR
This paper introduces a novel explanation method for deep networks that learns a low-dimensional, faithful representation of high-dimensional activations, making the network's decisions easier for humans to understand and improving human performance on classification tasks.
Contribution
The paper proposes the Explanation Neural Network (XNN) with Sparse Reconstruction Autoencoder (SRAE) for faithful, interpretable embeddings of deep network activations, enhancing explanation quality.
Findings
Outperforms saliency map baselines in human studies
Improves human performance on complex classification tasks
Introduces new metrics for quantitative explanation evaluation
Abstract
In this paper, we propose a novel Explanation Neural Network (XNN) to explain the predictions made by a deep network. The XNN works by learning a nonlinear embedding of a high-dimensional activation vector of a deep network layer into a low-dimensional explanation space while retaining faithfulness, i.e., the original deep learning predictions can be constructed from the few concepts extracted by our explanation network. We then visualize these concepts so that humans can learn about the high-level concepts the deep network uses to make decisions. We propose an algorithm called Sparse Reconstruction Autoencoder (SRAE) for learning the embedding to the explanation space. SRAE aims to reconstruct part of the original feature space while retaining faithfulness. A pull-away term is applied to SRAE to make the bases of the explanation space more orthogonal to each other. A visualization…
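As a rough illustration of two ingredients named in the abstract, the sketch below computes a pull-away penalty and a faithfulness penalty with NumPy. The abstract does not give the exact formulas, so everything here is an assumption: the pull-away term is written as the mean squared cosine similarity between distinct basis vectors (one common way to encourage orthogonality), faithfulness is written as a mean squared error between the original predictions and those rebuilt from the explanation space, and the function names and the row-per-basis layout are hypothetical. The sparse reconstruction term is omitted because its form is not described in the text above.

```python
import numpy as np


def pull_away_term(bases):
    """Mean squared cosine similarity between distinct rows of `bases`.

    `bases` is assumed to be a (k, d) array with one explanation-space
    basis vector per row. The value is 0 when all rows are mutually
    orthogonal and 1 when all rows point in the same direction.
    """
    k = bases.shape[0]
    if k < 2:
        return 0.0
    # Normalize each row to unit length (guarding against zero rows).
    norms = np.linalg.norm(bases, axis=1, keepdims=True)
    unit = bases / np.maximum(norms, 1e-12)
    # Squared pairwise cosine similarities, shape (k, k).
    sq_cos = (unit @ unit.T) ** 2
    # Drop the diagonal (each vector compared with itself).
    off_diag_sum = sq_cos.sum() - np.trace(sq_cos)
    return float(off_diag_sum / (k * (k - 1)))


def faithfulness_loss(original_pred, explained_pred):
    """Assumed form: mean squared gap between the original network's
    predictions and the predictions rebuilt from the explanation space."""
    original_pred = np.asarray(original_pred, dtype=float)
    explained_pred = np.asarray(explained_pred, dtype=float)
    return float(np.mean((original_pred - explained_pred) ** 2))
```

Under these assumptions, minimizing `pull_away_term` pushes the explanation bases toward orthogonality, while minimizing `faithfulness_loss` keeps the low-dimensional concepts predictive of the original network's output.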
Taxonomy
Methods
