Understanding CNN Hidden Neuron Activations Using Structured Background Knowledge and Deductive Reasoning
Abhilekha Dalal, Md Kamruzzaman Sarker, Adrita Barua, Eugene Vasserman, Pascal Hitzler

TL;DR
This paper introduces a method for interpreting hidden neuron activations in CNNs by leveraging large-scale background knowledge and deductive reasoning, providing meaningful labels to neurons and demystifying the black-box nature of deep learning.
Contribution
It presents a novel automated approach that uses background knowledge and symbolic reasoning to interpret CNN hidden neurons, advancing explainability in AI.
Findings
Automatically attaches meaningful labels to hidden neurons
Draws background knowledge from the Wikipedia concept hierarchy
Demonstrates an effective process for hypothesizing and verifying neuron interpretations
Abstract
A major challenge in Explainable AI is in correctly interpreting activations of hidden neurons: accurate interpretations would provide insights into the question of what a deep learning system has internally detected as relevant in the input, demystifying the otherwise black-box character of deep learning systems. The state of the art indicates that hidden node activations can, in some cases, be interpretable in a way that makes sense to humans, but systematic automated methods that would be able to hypothesize and verify interpretations of hidden neuron activations are underexplored. In this paper, we provide such a method and demonstrate that it provides meaningful interpretations. Our approach is based on using large-scale background knowledge (approximately 2 million classes curated from the Wikipedia concept hierarchy) together with a symbolic reasoning approach called Concept…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Adversarial Robustness in Machine Learning
