V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer

Hangzhou He; Lei Zhu; Xinliang Zhang; Shuang Zeng; Qian Chen; Yanye Lu

arXiv:2501.04975 · cs.CV · June 24, 2025

TL;DR

This paper introduces V2C-CBM, a novel concept bottleneck model that constructs visual concepts directly from multimodal models using a vision-to-concept tokenizer, enhancing interpretability and accuracy without extensive annotations.

Contribution

The paper proposes a new method to build concept bottlenecks directly from multimodal models using a vision-to-concept tokenizer, reducing annotation needs and improving performance.

Findings

1. V2C-CBM matches or outperforms LLM-supervised CBMs on benchmarks.

2. The approach is training-efficient and highly interpretable.

3. It constructs explicit visual concepts from unlabeled images.

Abstract

Concept Bottleneck Models (CBMs) offer inherent interpretability by first translating images into human-comprehensible concepts and then classifying with a linear combination of these concepts. However, annotating concepts for visual recognition tasks requires extensive expert knowledge and labor, which constrains the broad adoption of CBMs. Recent approaches have leveraged the knowledge of large language models to construct concept bottlenecks, with multimodal models like CLIP subsequently mapping image features into the concept feature space for classification. Despite this, the concepts produced by language models can be verbose and may introduce non-visual attributes, which hurts accuracy and interpretability. In this study, we investigate how to avoid these issues by constructing CBMs directly from multimodal models. To this end, we adopt common words as base concept…
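
The generic CBM pipeline described in the abstract (image features mapped into a concept space, then a linear combination of concept scores for classification) can be sketched roughly as follows. This is a minimal illustration, not the V2C-CBM method or the code in the official repository: the random arrays stand in for CLIP image and concept text embeddings, and the names `concept_scores`, `predict`, `W`, and `b` are hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale vectors to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def concept_scores(image_emb, concept_emb):
    # Bottleneck: one similarity score per (image, concept) pair,
    # shape (n_images, n_concepts).
    return l2_normalize(image_emb) @ l2_normalize(concept_emb).T

def predict(image_emb, concept_emb, W, b):
    # Classifier: a linear combination of concept scores gives class logits.
    scores = concept_scores(image_emb, concept_emb)
    logits = scores @ W + b
    return scores, logits.argmax(axis=1)

# Assumption: random stand-ins for embeddings from a multimodal model such as CLIP.
rng = np.random.default_rng(0)
n_images, dim, n_concepts, n_classes = 4, 16, 6, 3
image_emb = rng.normal(size=(n_images, dim))
concept_emb = rng.normal(size=(n_concepts, dim))

# Assumption: untrained weights; in practice W and b would be fit on labeled images.
W = rng.normal(size=(n_concepts, n_classes))
b = np.zeros(n_classes)

scores, preds = predict(image_emb, concept_emb, W, b)
print(scores.shape, preds.shape)
```

Because each column of `W` weights named concepts, a prediction in a model of this kind can be read off as the concepts that contributed most to the winning class, which is where the interpretability claim comes from.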

Code & Models

Repositories

riverback/v2c-cbm (PyTorch, official)

Taxonomy

Topics: Cloud Computing and Resource Management · Machine Learning and Data Classification · Scientific Computing and Data Management

Methods: ADaptive gradient method with the OPTimal convergence rate · Contrastive Language-Image Pre-training · Balanced Selection