On the Value of Labeled Data and Symbolic Methods for Hidden Neuron Activation Analysis

Abhilekha Dalal; Rushrukh Rayan; Adrita Barua; Eugene Y. Vasserman; Md Kamruzzaman Sarker; Pascal Hitzler

arXiv:2404.13567·cs.AI·February 23, 2026·1 cites

TL;DR

This paper introduces a novel, model-agnostic explainable AI method that uses a large concept hierarchy and OWL-reasoning to interpret hidden neuron activations in deep neural networks, enhancing understanding of internal representations.

Contribution

The paper presents a new symbolic, background knowledge-based approach for interpreting hidden neuron activations, demonstrating its effectiveness compared to existing explainable AI methods.

Findings

01. Automatically attaches meaningful class expressions to neurons

02. Provides competitive explanations in both quantitative and qualitative evaluations

03. Utilizes a Wikipedia-derived concept hierarchy with OWL-reasoning for explanation generation

Abstract

A major challenge in Explainable AI is correctly interpreting the activations of hidden neurons: accurate interpretations would help answer the question of what a deep learning system internally detects as relevant in the input, demystifying the otherwise black-box nature of deep learning systems. The state of the art indicates that hidden node activations can, in some cases, be interpreted in a way that makes sense to humans, but systematic automated methods that can hypothesize and verify interpretations of hidden neuron activations are underexplored. This is particularly true of approaches that both draw explanations from substantial background knowledge and are based on inherently explainable (symbolic) methods. In this paper, we introduce a novel model-agnostic post-hoc Explainable AI method and demonstrate that it provides meaningful interpretations.…
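
To make the idea of hypothesizing a label for a hidden neuron from background knowledge more concrete, here is a minimal toy sketch. It is not the paper's method: it replaces OWL reasoning with a hand-written subclass table, and every name in it (`PARENT`, `ancestors`, `hypothesize_label`, the example concepts) is a hypothetical chosen for illustration. It assumes that each image comes with a set of annotated concepts, and that images have already been split into those that strongly activate a neuron (positives) and those that do not (negatives).

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy concept hierarchy (child -> parent). A real system would
# obtain subclass relations from an ontology reasoner, not from a dict.
PARENT: Dict[str, str] = {
    "Dog": "Animal",
    "Cat": "Animal",
    "Animal": "Thing",
    "Car": "Vehicle",
    "Vehicle": "Thing",
}


def ancestors(concept: str) -> Set[str]:
    """Return the concept itself plus all of its superclasses."""
    result = {concept}
    while concept in PARENT:
        concept = PARENT[concept]
        result.add(concept)
    return result


def covered_concepts(annotations: Set[str]) -> Set[str]:
    """All concepts an image is an instance of, including inherited ones."""
    covered: Set[str] = set()
    for concept in annotations:
        covered |= ancestors(concept)
    return covered


def hypothesize_label(
    positives: List[Set[str]], negatives: List[Set[str]]
) -> List[Tuple[str, float]]:
    """Rank candidate concepts by how well they separate the two image sets.

    Score = (positives covered + negatives excluded) / total images.
    This scoring rule is an assumption made for this sketch.
    """
    pos_cov = [covered_concepts(a) for a in positives]
    neg_cov = [covered_concepts(a) for a in negatives]
    total = len(pos_cov) + len(neg_cov)
    if total == 0:
        return []
    # Only concepts that occur in at least one positive image are candidates.
    candidates: Set[str] = set().union(*pos_cov) if pos_cov else set()
    ranked = []
    for concept in candidates:
        true_pos = sum(concept in cov for cov in pos_cov)
        true_neg = sum(concept not in cov for cov in neg_cov)
        ranked.append((concept, (true_pos + true_neg) / total))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Images that activate the (hypothetical) neuron vs. images that do not.
    activating = [{"Dog"}, {"Cat"}, {"Dog", "Car"}]
    non_activating = [{"Car"}, {"Vehicle"}]
    for concept, score in hypothesize_label(activating, non_activating):
        print(f"{concept}: {score:.2f}")
```

In this toy run the general concept `Animal` ranks first because it covers every activating image and none of the others, while `Dog` and `Cat` each miss some positives. The sketch covers only the hypothesis step; verifying a candidate label would additionally require checking the neuron's activations on fresh images of that concept.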

Taxonomy

Topics: Neural Networks and Applications · Machine Learning in Materials Science · Computational Drug Discovery Methods