Concepts Worth Having: Refining VLM-Guided Concept Bottleneck Models with Minimal Annotations

Nicola Debole; Andrea Passerini; Stefano Teso; Andrea Pugnana; Emanuele Marconato

arXiv:2605.16405 · cs.CV · May 19, 2026

TL;DR

This paper introduces VH-CBM, a hybrid concept bottleneck model that combines vision-language models and minimal human annotations to improve concept quality and interpretability.

Contribution

VH-CBM employs a Gaussian Process to effectively propagate limited annotations, enhancing concept accuracy and interpretability over existing VLM-guided CBMs.

Findings

01 VH-CBM outperforms VLM-guided CBMs with as little as 1% of the concept annotations.

02 VH-CBM achieves better concept calibration.

03 VH-CBM supports active learning for efficient annotation.
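To make the active-learning finding concrete, here is a minimal sketch of one common acquisition rule for Gaussian Processes: repeatedly query the pool point with the highest posterior variance. The source does not say which rule VH-CBM uses, so the rule itself, the RBF kernel, the helper names (`rbf_kernel`, `posterior_variance`, `select_queries`), and the synthetic 2-D "embeddings" are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def posterior_variance(Z_labeled, Z_pool, length_scale=1.0, noise=1e-2):
    """GP posterior variance at every pool point (it does not depend on the labels)."""
    K = rbf_kernel(Z_labeled, Z_labeled, length_scale) + noise * np.eye(len(Z_labeled))
    K_star = rbf_kernel(Z_pool, Z_labeled, length_scale)
    solved = np.linalg.solve(K, K_star.T)  # shape (n_labeled, n_pool)
    # Prior variance is 1 for an RBF kernel, so var = 1 - k*^T K^{-1} k*.
    return np.maximum(1.0 - np.sum(K_star.T * solved, axis=0), 0.0)

def select_queries(Z_pool, initial_idx, budget, length_scale=1.0):
    """Greedily pick `budget` extra points to annotate by maximum variance."""
    labeled = list(initial_idx)
    for _ in range(budget):
        var = posterior_variance(Z_pool[labeled], Z_pool, length_scale)
        var[labeled] = -np.inf  # never re-query an annotated point
        labeled.append(int(np.argmax(var)))
    return labeled

# Hypothetical pool: three well-separated clusters of 20 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
Z_pool = np.vstack([c + rng.normal(scale=0.3, size=(20, 2)) for c in centers])

# Start with a single annotation in the first cluster, then buy two more.
queried = select_queries(Z_pool, initial_idx=[0], budget=2, length_scale=2.0)
clusters_covered = {i // 20 for i in queried}
```

In this toy setup the two extra queries land in the two clusters that had no annotation, which is the behaviour one would hope for from an annotation-efficient strategy. The paper's actual procedure may differ.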

Abstract

Concept-bottleneck models (CBMs) are neural classifiers that compute predictions from high-level concepts extracted from the input. CBMs ensure stakeholders can understand the concepts -- and the predictions they entail -- by learning these from concept-level annotations, which are, however, seldom available. Recent CBM architectures work around this issue by obtaining annotations from Vision-Language Models (VLMs). While greatly broadening applicability, doing so can yield lower-quality concepts and therefore less interpretable models. We strike a middle ground by introducing Vision-plus-Human-guided CBM (VH-CBM), a hybrid approach that exploits both VLMs and a small amount of dense annotations. VH-CBM employs a Gaussian Process in the VLM's embedding space, which captures useful global information about the target domain, to propagate the expert's supervision to any target data…
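The propagation idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It runs plain Gaussian Process regression on ±1 concept labels, using an RBF kernel over embeddings and a zero prior mean. The kernel, the noise level, the function names (`rbf_kernel`, `gp_propagate`), and the synthetic vectors standing in for VLM embeddings are all assumptions.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_propagate(Z_labeled, y_labeled, Z_all, length_scale=1.0, noise=1e-2):
    """Return the GP posterior mean and variance of a concept at every row of Z_all."""
    K = rbf_kernel(Z_labeled, Z_labeled, length_scale) + noise * np.eye(len(Z_labeled))
    K_star = rbf_kernel(Z_all, Z_labeled, length_scale)
    L = np.linalg.cholesky(K)  # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_labeled))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance is 1 for an RBF kernel
    return mean, np.maximum(var, 0.0)

# Synthetic stand-in for VLM embeddings: two clusters, concept present (+1) or absent (-1).
rng = np.random.default_rng(0)
Z_pos = rng.normal(loc=+2.0, scale=0.5, size=(100, 8))
Z_neg = rng.normal(loc=-2.0, scale=0.5, size=(100, 8))
Z_all = np.vstack([Z_pos, Z_neg])
y_true = np.concatenate([np.ones(100), -np.ones(100)])

# Only four points (2% of the data) carry an expert annotation.
labeled_idx = np.array([0, 1, 100, 101])
mean, var = gp_propagate(Z_all[labeled_idx], y_true[labeled_idx], Z_all, length_scale=3.0)

concept_pred = np.sign(mean)  # propagated concept label for every point
accuracy = float(np.mean(concept_pred == y_true))
```

The posterior variance is smallest at the annotated points and grows with distance from them in embedding space, so it can serve as a rough per-concept uncertainty signal. How VH-CBM combines the VLM's own scores with the Gaussian Process is not recoverable from the truncated abstract and is not modelled here.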
