LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions
Nhat Hoang-Xuan, Minh Vu, My T. Thai

TL;DR
This paper introduces a method using multimodal large language models to automatically discover and validate novel, interpretable neuron concepts in deep neural networks, enhancing understanding without manual concept specification.
Contribution
The paper presents an automated, open-ended approach for concept discovery in DNNs using multimodal LLMs, eliminating the need for pre-defined concepts or manual curation.
Findings
Discovered novel neuron concepts without pre-defined sets
Validated concepts through example and counterexample generation
Produced more faithful explanations of neuron behavior
Abstract
Textual, concept-based explanations for neurons in deep neural networks (DNNs) are important for understanding how a DNN model works. Prior works have associated concepts with neurons based on examples of concepts or on a pre-defined set of concepts, which limits possible explanations to what the user expects and hinders the discovery of new concepts. Furthermore, defining the set of concepts requires manual work from the user, either by directly specifying the concepts or by collecting examples. To overcome these limitations, we propose to leverage multimodal large language models for automatic and open-ended concept discovery. We show that, without a restricted set of pre-defined concepts, our method gives rise to novel interpretable concepts that are more faithful to the model's behavior. To quantify this, we validate each concept by generating examples and counterexamples and evaluating the…
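The abstract is cut off before it says how generated examples and counterexamples are evaluated. As an illustration only, the sketch below shows one plausible way to score a candidate concept once a neuron's activations on both sets have been collected. The function name `separation_score`, the pairwise (AUROC-style) score, and the sample activation values are all assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence


def separation_score(example_acts: Sequence[float],
                     counter_acts: Sequence[float]) -> float:
    """Hypothetical validation score (an assumption, not the paper's metric).

    Returns the fraction of (example, counterexample) pairs in which the
    neuron activates more strongly on the example; ties count as half.
    This equals the AUROC of the activation as a concept detector:
    1.0 means perfect separation, 0.5 means chance level.
    """
    if len(example_acts) == 0 or len(counter_acts) == 0:
        raise ValueError("both activation lists must be non-empty")

    wins = 0.0
    for a in example_acts:
        for b in counter_acts:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(example_acts) * len(counter_acts))


# Made-up activations of one neuron on generated images (illustrative only).
acts_on_examples = [0.9, 0.8, 0.7]         # images that depict the concept
acts_on_counterexamples = [0.1, 0.2, 0.3]  # images that do not
score = separation_score(acts_on_examples, acts_on_counterexamples)
```

Under this assumed scoring, a concept whose examples consistently activate the neuron more than its counterexamples scores near 1.0, while a concept that does not describe the neuron scores near 0.5.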
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Biomedical Text Mining and Ontologies
Methods: Sparse Evolutionary Training
