Towards Faithful Multimodal Concept Bottleneck Models
Pierre Moreau, Emeline Pineau Ferrand, Yann Choho, Benjamin Wong, Annabelle Blangero, Milan Bhan

TL;DR
This paper introduces f-CBM, a multimodal concept bottleneck model that jointly improves concept detection and reduces leakage, achieving a better balance between accuracy and interpretability across image and text datasets.
Contribution
f-CBM is a novel framework that combines a leakage loss and an expressive prediction head to enhance faithfulness in multimodal concept bottleneck models.
Findings
f-CBM achieves a better trade-off among task accuracy, concept detection, and leakage reduction than existing methods.
The model is versatile, working effectively on both image and text datasets.
It outperforms existing methods by jointly addressing detection and leakage issues.
Abstract
Concept Bottleneck Models (CBMs) are interpretable models that route predictions through a layer of human-interpretable concepts. While widely studied in vision and, more recently, in NLP, CBMs remain largely unexplored in multimodal settings. For their explanations to be faithful, CBMs must satisfy two conditions: concepts must be properly detected, and concept representations must encode only their intended semantics, without smuggling extraneous task-relevant or inter-concept information into final predictions, a phenomenon known as leakage. Existing approaches treat concept detection and leakage mitigation as separate problems, and typically improve one at the expense of predictive accuracy. In this work, we introduce f-CBM, a faithful multimodal CBM framework built on a vision-language backbone that jointly targets both aspects through two complementary strategies: a differentiable…
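As a rough illustration of what "routing predictions through a layer of human-interpretable concepts" means, the sketch below implements a generic toy concept bottleneck in plain Python. It is not f-CBM: the class name `ToyCBM`, the random linear weights, and the layer sizes are all assumptions made for illustration, and the paper's leakage loss and prediction head are not reproduced because the abstract here is truncated. The only point shown is structural: the label is computed from the concept activations alone, never from the raw input features.

```python
import math
import random


def sigmoid(x):
    """Squash a real-valued score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, inputs):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


class ToyCBM:
    """Hypothetical toy concept bottleneck: features -> concepts -> label."""

    def __init__(self, n_features, n_concepts, n_classes, seed=0):
        rng = random.Random(seed)
        # Untrained random weights; a real model would learn these.
        self.Wc = [[rng.uniform(-1, 1) for _ in range(n_features)]
                   for _ in range(n_concepts)]
        self.bc = [0.0] * n_concepts
        self.Wy = [[rng.uniform(-1, 1) for _ in range(n_concepts)]
                   for _ in range(n_classes)]
        self.by = [0.0] * n_classes

    def concepts(self, features):
        """Concept layer: one activation in (0, 1) per concept."""
        return [sigmoid(z) for z in linear(self.Wc, self.bc, features)]

    def predict_from_concepts(self, concept_acts):
        """Prediction head: sees only concept activations."""
        logits = linear(self.Wy, self.by, concept_acts)
        return max(range(len(logits)), key=logits.__getitem__)

    def predict(self, features):
        """Full forward pass; returns (concept activations, class index)."""
        c = self.concepts(features)
        return c, self.predict_from_concepts(c)


model = ToyCBM(n_features=4, n_concepts=3, n_classes=2)
concept_acts, label = model.predict([0.5, -1.0, 2.0, 0.1])
```

Because `predict` calls `predict_from_concepts` on the concept activations, any information that reaches the label must pass through those activations. Leakage, as the abstract describes it, is the case where those activations carry more than their intended concept semantics.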
Taxonomy
Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
