Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency
Maor Dikter, Tsachi Blau, Chaim Baskin

TL;DR
This paper introduces CLEAR, a framework that enhances interpretability in concept bottleneck models by approximating concept embeddings in a latent space, leading to improved transparency and state-of-the-art performance in image classification.
Contribution
CLEAR leverages score matching and Langevin sampling to approximate concept embeddings, enabling better interpretability and selection of concepts in vision-language models.
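To make the Langevin sampling step concrete, here is a minimal sketch of unadjusted Langevin dynamics on a toy problem. It is not the authors' implementation: the analytic Gaussian score in `gaussian_score` stands in for a score network that would be learned by score matching, and the names `gaussian_score` and `langevin_sample`, the step size, and the step count are all illustrative assumptions.

```python
import numpy as np

def gaussian_score(x, mean, std):
    """Score (gradient of the log-density) of an isotropic Gaussian N(mean, std^2 I).

    Assumption: this closed-form score replaces a learned score model.
    """
    return -(x - mean) / std**2

def langevin_sample(score_fn, x0, step_size=0.01, n_steps=1000, rng=None):
    """Unadjusted Langevin dynamics.

    Each step applies: x <- x + (step_size / 2) * score(x) + sqrt(step_size) * noise
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + 0.5 * step_size * score_fn(x) + np.sqrt(step_size) * noise
    return x

# Toy usage: start 5000 chains far from the target and let them drift toward it.
rng = np.random.default_rng(0)
mean = np.array([2.0, -1.0])          # hypothetical target mean
x0 = rng.standard_normal((5000, 2)) * 5.0
samples = langevin_sample(lambda x: gaussian_score(x, mean, 1.0), x0, rng=rng)
```

After enough steps the chains are approximately distributed according to the density whose score was supplied, so the sample mean and spread should be close to those of the target Gaussian. With a learned score in place of the analytic one, the same update rule would draw approximate samples from the modeled distribution.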
Findings
Achieved state-of-the-art results on multiple benchmarks.
Provided more transparent decision-making insights.
Improved concept selection process in CBMs.
Abstract
Concept bottleneck models (CBMs) have emerged as critical tools in domains where interpretability is paramount. These models rely on predefined textual descriptions, referred to as concepts, to inform their decision-making process and offer more accurate reasoning. As a result, the selection of concepts used in the model is of utmost significance. This study proposes Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency, abbreviated as CLEAR, a framework for constructing a CBM for image classification. Using score matching and Langevin sampling, we approximate the embedding of concepts within the latent space of a vision-language model (VLM) by learning the scores associated with the joint distribution of images and concepts. A concept…
Taxonomy
Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Topic Modeling
