TextCAVs: Debugging vision models using text
Angus Nicolson, Yarin Gal, J. Alison Noble

TL;DR
TextCAVs introduces a cost-effective method for explaining vision models by creating concept activation vectors using text descriptions with vision-language models, enabling easier debugging without image annotations.
Contribution
The paper presents a novel approach to generate concept activation vectors solely from text descriptions using vision-language models, reducing the need for labeled image data.
Findings
In early experiments, TextCAVs produces reasonable explanations on chest X-ray and natural image datasets.
The explanations can be used effectively to debug deep learning models.
Because no image annotation is needed, many concepts can be tested quickly and interactively.
Abstract
Concept-based interpretability methods are a popular form of explanation for deep learning models which provide explanations in the form of high-level human interpretable concepts. These methods typically find concept activation vectors (CAVs) using a probe dataset of concept examples. This requires labelled data for these concepts -- an expensive task in the medical domain. We introduce TextCAVs: a novel method which creates CAVs using vision-language models such as CLIP, allowing for explanations to be created solely using text descriptions of the concept, as opposed to image exemplars. This reduced cost in testing concepts allows for many concepts to be tested and for users to interact with the model, testing new ideas as they are thought of, rather than a delay caused by image collection and annotation. In early experimental results, we demonstrate that TextCAVs produces reasonable…
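The abstract says that CAVs are built from text descriptions via a vision-language model, but this summary does not say how a text embedding reaches the vision model's activation space. The sketch below is one possible reading, not the authors' implementation. It assumes a linear map (`vlm_to_model`, a hypothetical name) from the vision-language embedding space to the target model's activations, and it scores a concept by the directional derivative of a class logit along the CAV, as is common in CAV-based methods. All numbers are toy values chosen for illustration; in practice the text embedding would come from a model such as CLIP.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def text_cav(text_embedding, vlm_to_model):
    """Map a text embedding into the target model's activation space.

    ASSUMPTION: `vlm_to_model` is a linear map between the two spaces.
    How it is obtained is not described in this summary.
    """
    return normalize(matvec(vlm_to_model, text_embedding))

def concept_sensitivity(class_gradient, cav):
    """Directional derivative of a class logit along the CAV.

    `class_gradient` is the gradient of the logit with respect to the
    activations at the chosen layer (toy values here).
    """
    return dot(class_gradient, cav)

# Toy example: 2-d vision-language space, 3-d activation space.
text_embedding = [1.0, 0.0]          # stand-in for a text embedding of a concept
vlm_to_model = [[3.0, 0.0],
                [4.0, 0.0],
                [0.0, 1.0]]          # hypothetical linear map (3 x 2)
class_gradient = [1.0, 0.5, -2.0]    # stand-in gradient of one class logit

cav = text_cav(text_embedding, vlm_to_model)
sensitivity = concept_sensitivity(class_gradient, cav)
print(cav, sensitivity)
```

A positive sensitivity would suggest that moving activations in the concept's direction raises the class logit. Since only a text embedding changes between concepts, scoring a new concept costs one matrix-vector product and one dot product, which is in line with the quick, interactive testing the abstract describes.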
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Contrastive Language-Image Pre-training
