CoSy: Evaluating Textual Explanations of Neurons
Laura Kopf, Philine Lou Bommer, Anna Hedström, Sebastian Lapuschkin, Marina M.-C. Höhne, Kirill Bykov

TL;DR
CoSy is a new framework that quantitatively evaluates the quality of textual explanations of neurons in deep neural networks by generating data points conditioned on explanations and comparing neuron responses.
Contribution
Introduces CoSy, an architecture-agnostic, generative-model-based framework for assessing the quality of textual neuron explanations in DNNs.
Findings
Significant differences in explanation quality across methods
Framework passes sanity checks
Effective benchmarking of neuron description methods
Abstract
A crucial aspect of understanding the complex nature of Deep Neural Networks (DNNs) is the ability to explain learned concepts within their latent representations. While methods exist to connect neurons to human-understandable textual descriptions, evaluating the quality of these explanations is challenging due to the lack of a unified quantitative approach. We introduce CoSy (Concept Synthesis), a novel, architecture-agnostic framework for evaluating textual explanations of latent neurons. Given textual explanations, our proposed framework uses a generative model conditioned on textual input to create data points representing the explanations. By comparing the neuron's response to these generated data points and control data points, we can estimate the quality of the explanation. We validate our framework through sanity checks and benchmark various neuron description methods for…
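The comparison step described in the abstract can be illustrated with a small sketch. It covers only the final scoring: given a neuron's activations on explanation-conditioned (generated) inputs and on control inputs, it computes how well the two sets separate. The abstract does not name a scoring function, so the two scores here (a rank-based AUC and a standardized mean difference) are assumptions chosen for illustration, as are the function names. Image generation and the network forward pass are out of scope, and the activation values are toy numbers, not results from the paper.

```python
import numpy as np


def auc_score(concept_acts, control_acts):
    """Probability that a random concept activation exceeds a random
    control activation, counting ties as 0.5 (illustrative choice)."""
    concept = np.asarray(concept_acts, dtype=float)
    control = np.asarray(control_acts, dtype=float)
    # Pairwise differences between every concept and every control activation.
    diff = concept[:, None] - control[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def standardized_mean_difference(concept_acts, control_acts):
    """Gap between mean activations, in units of the control standard
    deviation (illustrative choice). Returns NaN if the control set is constant."""
    concept = np.asarray(concept_acts, dtype=float)
    control = np.asarray(control_acts, dtype=float)
    std = control.std()
    if std == 0:
        return float("nan")
    return float((concept.mean() - control.mean()) / std)


# Toy activations standing in for a neuron's responses (hypothetical values).
concept_activations = [3.0, 5.0]   # inputs generated from the explanation text
control_activations = [0.0, 2.0]   # control inputs

print(auc_score(concept_activations, control_activations))
print(standardized_mean_difference(concept_activations, control_activations))
```

Under this sketch, an AUC near 1 means the neuron responds more strongly to the generated inputs than to the controls, which would support the explanation. An AUC near 0.5 means the neuron does not distinguish them.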
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
