Towards Faithful Multimodal Concept Bottleneck Models

Pierre Moreau; Emeline Pineau Ferrand; Yann Choho; Benjamin Wong; Annabelle Blangero; Milan Bhan

arXiv:2603.13163 · cs.CV · March 16, 2026

TL;DR

This paper introduces f-CBM, a multimodal concept bottleneck model that jointly improves concept detection and reduces leakage, achieving a better balance between accuracy and interpretability across image and text datasets.

Contribution

f-CBM is a novel framework that combines a leakage loss and an expressive prediction head to enhance faithfulness in multimodal concept bottleneck models.

Findings

01. f-CBM achieves superior trade-offs between accuracy, concept detection, and leakage reduction.

02. The model is versatile, working effectively on both image and text datasets.

03. It outperforms existing methods by jointly addressing detection and leakage issues.

Abstract

Concept Bottleneck Models (CBMs) are interpretable models that route predictions through a layer of human-interpretable concepts. While widely studied in vision and, more recently, in NLP, CBMs remain largely unexplored in multimodal settings. For their explanations to be faithful, CBMs must satisfy two conditions: concepts must be properly detected, and concept representations must encode only their intended semantics, without smuggling extraneous task-relevant or inter-concept information into final predictions, a phenomenon known as leakage. Existing approaches treat concept detection and leakage mitigation as separate problems, and typically improve one at the expense of predictive accuracy. In this work, we introduce f-CBM, a faithful multimodal CBM framework built on a vision-language backbone that jointly targets both aspects through two complementary strategies: a differentiable…
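
To make the bottleneck idea in the abstract concrete, here is a minimal sketch of a generic concept bottleneck forward pass in plain Python. It is not the f-CBM architecture: the abstract is truncated before it describes the leakage loss or the prediction head, so neither is shown. The class name `ToyCBM`, the function `joint_loss`, the `concept_weight` parameter, the random initial weights, and the feature vector standing in for a backbone embedding are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, x):
    # weights holds one row per output unit
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyCBM:
    """Hypothetical toy model: features -> concept scores -> class probabilities."""

    def __init__(self, n_features, n_concepts, n_classes, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.W_c, self.b_c = init(n_concepts, n_features), [0.0] * n_concepts
        self.W_y, self.b_y = init(n_classes, n_concepts), [0.0] * n_classes

    def forward(self, features):
        # Concept layer: one probability per human-interpretable concept.
        concepts = [sigmoid(z) for z in linear(self.W_c, self.b_c, features)]
        # The label head sees ONLY the concept scores; this is the bottleneck.
        class_probs = softmax(linear(self.W_y, self.b_y, concepts))
        return concepts, class_probs


def joint_loss(concepts, class_probs, concept_targets, label, concept_weight=1.0):
    # Task cross-entropy plus a weighted concept-detection term
    # (mean binary cross-entropy over concepts). Assumed generic objective.
    eps = 1e-12
    task = -math.log(class_probs[label] + eps)
    concept = -sum(
        t * math.log(c + eps) + (1 - t) * math.log(1 - c + eps)
        for c, t in zip(concepts, concept_targets)
    ) / len(concepts)
    return task + concept_weight * concept


model = ToyCBM(n_features=4, n_concepts=3, n_classes=2)
concepts, class_probs = model.forward([0.2, -1.0, 0.5, 0.3])
loss = joint_loss(concepts, class_probs, concept_targets=[1, 0, 1], label=1)
```

Because the label head receives only the concept scores, any task-relevant signal that reaches the prediction must pass through them. That is why soft concept scores can carry extra information beyond their intended meaning, the leakage phenomenon the abstract describes.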

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning