A Case Study on Concept Induction for Neuron-Level Interpretability in CNN
Moumita Sen Sarma, Samatha Ereshi Akkamahadevi, Pascal Hitzler

TL;DR
This paper evaluates the generalizability of a concept induction framework for interpreting hidden neurons in CNNs by applying it to the SUN2012 scene recognition dataset, confirming its broader applicability.
Contribution
The paper demonstrates that a previously proposed neuron interpretability method transfers effectively from ADE20K to SUN2012, showing its potential for wider use in scene understanding.
Findings
The method assigns interpretable semantic labels to hidden neurons using the SUN2012 dataset.
The semantic labels are validated through web-sourced images and statistical testing.
The framework transfers from ADE20K to SUN2012, indicating applicability beyond a single dataset.
Abstract
Deep Neural Networks (DNNs) have advanced applications in domains such as healthcare, autonomous systems, and scene understanding, yet the internal semantics of their hidden neurons remain poorly understood. Prior work introduced a Concept Induction-based framework for hidden neuron analysis and demonstrated its effectiveness on the ADE20K dataset. In this case study, we investigate whether the approach generalizes by applying it to the SUN2012 dataset, a large-scale scene recognition benchmark. Using the same workflow, we assign interpretable semantic labels to neurons and validate them through web-sourced images and statistical testing. Our findings confirm that the method transfers to SUN2012, showing its broader applicability.
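The statistical validation step mentioned in the abstract can be illustrated with a small sketch. The abstract does not name the test used, so the code below swaps in a one-sided permutation test on mean activation as a stand-in. The function name, variable names, significance threshold, and activation values are all hypothetical and are not taken from the paper. The idea is to check whether a neuron activates more strongly on web-sourced images matching its assigned label than on other images.

```python
import random
from statistics import mean


def permutation_test(target_acts, other_acts, n_permutations=5000, seed=0):
    """One-sided permutation test: is mean activation higher on target images?

    This is an illustrative stand-in; the source does not specify its test.
    """
    rng = random.Random(seed)
    observed = mean(target_acts) - mean(other_acts)
    pooled = list(target_acts) + list(other_acts)
    n_target = len(target_acts)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign activations to the two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_target]) - mean(pooled[n_target:])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical activations of one neuron (made-up numbers, not from the paper).
target_activations = [0.91, 0.85, 0.78, 0.88, 0.95, 0.81, 0.90, 0.87]
other_activations = [0.12, 0.30, 0.05, 0.22, 0.18, 0.27, 0.09, 0.15]

observed_diff, p = permutation_test(target_activations, other_activations)
# Assumed threshold of 0.05; the label counts as supported if p falls below it.
label_supported = p < 0.05
print(f"difference in means: {observed_diff:.3f}, p-value: {p:.4f}")
```

A permutation test is used here only because it needs no distributional assumptions and runs with the standard library. Any two-sample test suited to the activation data could take its place.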
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
