Explaining Neural Networks Semantically and Quantitatively
Runjin Chen, Hao Chen, Ge Huang, Jie Ren, and Quanshi Zhang

TL;DR
This paper introduces a method to explain CNN predictions both semantically and quantitatively by distilling the CNN's knowledge into an explainable additive model, which then provides a quantitative explanation for each prediction.
Contribution
The paper proposes a knowledge distillation approach that transfers what a CNN has learned into an explainable additive model. It analyzes the bias-interpreting problem of the explainable model and develops prior losses to guide its learning.
Findings
The distilled additive model provides quantitative explanations of CNN predictions
Prior losses address the bias-interpreting problem of the explainable model
Experimental results demonstrate the effectiveness of the method
Abstract
This paper presents a method to explain the knowledge encoded in a convolutional neural network (CNN) quantitatively and semantically. Analyzing the specific rationale behind each prediction made by the CNN is a key issue in understanding neural networks, and it is also of significant practical value in certain applications. In this study, we propose to distill knowledge from the CNN into an explainable additive model, so that we can use the explainable model to provide a quantitative explanation for the CNN prediction. We analyze the typical bias-interpreting problem of the explainable model and develop prior losses to guide the learning of the explainable additive model. Experimental results demonstrate the effectiveness of our method.
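To make the idea of distilling into an additive model concrete, here is a minimal sketch, not the authors' implementation. It assumes the CNN output (the "teacher") can be approximated by a weighted sum of per-concept scores plus a bias, and it stands in for the prior losses with a simple quadratic penalty that pulls the weights toward given prior weights. The function names, the fixed (input-independent) weights, the quadratic penalty, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_additive_explainer(concept_scores, teacher_output,
                           prior_weights=None, prior_strength=0.0):
    """Fit teacher_output ~= concept_scores @ w + b (hypothetical helper).

    Minimizes ||A @ theta - y||^2 + prior_strength * ||w - prior_weights||^2,
    where theta = [w, b]. The bias b is not penalized.
    """
    X = np.asarray(concept_scores, dtype=float)   # (n_samples, n_concepts)
    y = np.asarray(teacher_output, dtype=float)   # (n_samples,)
    n_samples, n_concepts = X.shape
    if prior_weights is None:
        prior_weights = np.zeros(n_concepts)
    w0 = np.asarray(prior_weights, dtype=float)

    # Append a column of ones so the bias is learned with the weights.
    A = np.hstack([X, np.ones((n_samples, 1))])

    # Penalty matrix: identity on the weights, zero on the bias entry.
    penalty = np.eye(n_concepts + 1)
    penalty[-1, -1] = 0.0
    theta0 = np.append(w0, 0.0)

    # Closed-form solution of the penalized least-squares problem.
    lhs = A.T @ A + prior_strength * penalty
    rhs = A.T @ y + prior_strength * (penalty @ theta0)
    theta = np.linalg.solve(lhs, rhs)
    return theta[:-1], theta[-1]


def concept_contributions(concept_scores, weights):
    """Per-sample, per-concept terms of the additive explanation."""
    return np.asarray(concept_scores, dtype=float) * np.asarray(weights, dtype=float)


# Synthetic stand-in for a CNN: the "teacher" depends on two of three concepts.
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 3))
teacher = 2.0 * scores[:, 0] - 1.0 * scores[:, 1] + 0.5

weights, bias = fit_additive_explainer(scores, teacher)
parts = concept_contributions(scores, weights)
reconstruction = parts.sum(axis=1) + bias  # additive terms plus bias
```

Each row of `parts` splits one prediction into per-concept contributions, which is the sense in which an additive model gives a quantitative explanation. Raising `prior_strength` pulls the fitted weights toward `prior_weights`, a rough analogue of using a prior to keep the explainer from attributing the prediction to only a few terms.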
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
