LINE: LLM-based Iterative Neuron Explanations for Vision Models
Vladimir Zaigrajew, Michał Piechota, Gaspar Sekula, Paweł Gelar, Przemysław Biecek

TL;DR
LINE is a novel, training-free, black-box method that uses large language models and text-to-image generators to iteratively label and interpret neurons in vision models with open-vocabulary concepts.
Contribution
It introduces an iterative, closed-loop approach to neuron explanation that proposes and refines open-vocabulary concepts guided by activation history, rather than searching a predefined concept vocabulary.
Findings
Achieves state-of-the-art performance across multiple model architectures, with AUC improvements of up to 0.11 on ImageNet and 0.05 on Places365.
Discovers 27% more concepts missed by predefined vocabularies.
Provides visual explanations comparable to gradient-based methods.
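To make the AUC figures concrete, here is a minimal sketch of how an AUC score for a neuron label can be computed. It assumes (this summary does not spell out the evaluation protocol) that AUC measures how well the neuron's activations separate images that match the label from images that do not. The function name `auc` and the activation values are illustrative, not taken from the paper.

```python
from typing import Sequence


def auc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical neuron activations on images that match / do not match a label.
matching = [0.9, 0.8, 0.4]
non_matching = [0.5, 0.3, 0.1]
label_auc = auc(matching, non_matching)
print(f"AUC = {label_auc:.3f}")
```

Under this reading, a value of 0.5 means the label is no better than chance at predicting when the neuron fires, and 1.0 means perfect separation, so a gain of 0.11 is a sizeable shift on that scale.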
Abstract
Interpreting individual neurons in deep neural networks is a crucial step towards understanding their complex decision-making processes and ensuring AI safety. Despite recent progress in neuron labeling, existing methods often limit the search space to predefined concept vocabularies or produce overly specific descriptions that fail to capture higher-order, global concepts. We introduce LINE, a novel, training-free iterative approach tailored for open-vocabulary concept labeling in vision models. Operating in a strictly black-box setting, LINE leverages a large language model and a text-to-image generator to iteratively propose and refine concepts in a closed loop, guided by activation history. LINE achieves state-of-the-art performance across multiple model architectures, yielding AUC improvements of up to 0.11 on ImageNet and 0.05 on Places365, while discovering, on average, 27% of…
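The closed loop described in the abstract can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: `propose`, `generate`, and `activation` are hypothetical callables standing in for the large language model, the text-to-image generator, and the black-box query to the vision model, and the toy versions below replace all three with simple string logic so the example runs on its own.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, float]]  # (concept, mean activation) per iteration


def iterative_labeling(
    propose: Callable[[History], str],
    generate: Callable[[str], List[str]],
    activation: Callable[[str], float],
    n_iters: int = 4,
) -> Tuple[str, float, History]:
    """Propose a concept, render it, score it on the neuron, and repeat."""
    history: History = []
    for _ in range(n_iters):
        concept = propose(history)          # LLM stand-in sees past scores
        images = generate(concept)          # text-to-image stand-in
        score = sum(activation(img) for img in images) / len(images)
        history.append((concept, score))    # feedback for the next proposal
    best_concept, best_score = max(history, key=lambda item: item[1])
    return best_concept, best_score, history


# --- Toy stand-ins (assumptions for illustration only) ---
TOY_NEURON = {"striped": 0.5, "animal": 0.25, "zebra": 1.0}
CANDIDATES = ["dog", "animal", "striped animal", "zebra"]


def toy_propose(history: History) -> str:
    # Return the first untried candidate; a real LLM would refine using scores.
    tried = {concept for concept, _ in history}
    for candidate in CANDIDATES:
        if candidate not in tried:
            return candidate
    return max(history, key=lambda item: item[1])[0]


def toy_generate(concept: str) -> List[str]:
    # A toy "image" is just the prompt text, repeated four times.
    return [concept] * 4


def toy_activation(image: str) -> float:
    # The toy neuron sums weights of the trigger words present in the "image".
    return sum(w for word, w in TOY_NEURON.items() if word in image.split())


best, score, history = iterative_labeling(toy_propose, toy_generate, toy_activation)
print(best, score)
```

The loop only queries activations on generated inputs and never touches weights or gradients, which is what the black-box setting requires. In a real system the proposer would read the activation history to decide whether to generalize or specialize the next concept, which the toy proposer here does not attempt.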
