TExplain: Explaining Learned Visual Features via Pre-trained (Frozen) Language Models
Saeid Asgari Taghanaki, Aliasghar Khani, Ali Saheb Pasand, Amir Khasahmadi, Aditya Sanghi, Karl D.D. Willis, Ali Mahdavi-Amiri

TL;DR
TExplain leverages pre-trained language models to interpret learned features of image classifiers by generating explanatory sentences, revealing biases, correlations, and decision patterns to improve understanding and robustness.
Contribution
This paper introduces TExplain, a method that connects the feature space of image classifiers to language models in order to generate textual explanations. The authors describe it as the first approach to use language models for interpreting learned visual features.
Findings
Effectively explains learned features of classifiers.
Identifies biases and spurious correlations.
Enhances interpretability and robustness of models.
Abstract
Interpreting the learned features of vision models has posed a longstanding challenge in the field of machine learning. To address this issue, we propose a novel method that leverages the capabilities of language models to interpret the learned features of pre-trained image classifiers. Our method, called TExplain, tackles this task by training a neural network to establish a connection between the feature space of image classifiers and language models. Then, during inference, our approach generates a vast number of sentences to explain the features learned by the classifier for a given image. These sentences are then used to extract the most frequent words, providing a comprehensive understanding of the learned features and patterns within the classifier. Our method, for the first time, utilizes these frequent words corresponding to a visual representation to provide insights into the…
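The word-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes the generated sentences are already available as plain strings. The function name `most_frequent_words`, the stopword list, and the choice to count each word once per sentence are all hypothetical; the abstract does not specify how the authors tokenize, filter, or count words.

```python
from collections import Counter
import re

# Hypothetical stopword list; the abstract does not say how words are filtered.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "with", "is", "are", "to"}

def most_frequent_words(sentences, top_k=5, stopwords=STOPWORDS):
    """Count how many sentences mention each word and return the top_k words."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase, keep alphabetic tokens, and count each word once per sentence.
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        counts.update(words - stopwords)
    return counts.most_common(top_k)

# Stand-ins for sentences a language model might generate for one image.
sentences = [
    "a dog running on the grass",
    "a brown dog in a field of grass",
    "a dog with a ball",
]
print(most_frequent_words(sentences, top_k=2))  # [('dog', 3), ('grass', 2)]
```

In this toy case, "grass" ranking just behind "dog" is the kind of signal that could hint at a background feature the classifier associates with the class, which is the sort of correlation the method aims to surface.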
Taxonomy
Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Topic Modeling
